Arm Machine Learning Processor 

Industry-leading performance and efficiency for inference at the edge.

Machine Learning Processor Block Diagram.

The Arm Machine Learning processor is an optimized, ground-up design for machine learning acceleration, targeting mobile and adjacent markets. The solution consists of state-of-the-art optimized fixed-function engines that provide best-in-class performance within a constrained power envelope.

Additional programmable layer engines support the execution of non-convolution layers and the implementation of selected primitives and operators, and accommodate future innovation and new generations of algorithms. The network control unit manages the overall execution and traversal of the network, and the DMA moves data in and out of main memory.
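
The division of labor described above can be pictured with a small toy model. This is a sketch only, not Arm's implementation: the `Layer` type, the `route` and `traverse` functions, and the layer labels are all hypothetical. The only assumption taken from the text is that convolution layers run on the fixed-function engines while non-convolution layers run on the programmable layer engines, with a control unit walking the network in order.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical labels for the two engine types named in the text.
FIXED_FUNCTION = "fixed-function engine"
PROGRAMMABLE = "programmable layer engine"


@dataclass
class Layer:
    name: str
    kind: str  # illustrative labels such as "conv", "pool", "softmax"


def route(layer: Layer) -> str:
    """Pick an engine type: convolutions go to fixed-function hardware,
    every other layer kind goes to the programmable layer engines."""
    return FIXED_FUNCTION if layer.kind == "conv" else PROGRAMMABLE


def traverse(network: List[Layer]) -> List[Tuple[str, str]]:
    """Stand-in for the network control unit: walk the layers in order
    and record which engine type each one would run on."""
    return [(layer.name, route(layer)) for layer in network]


# A made-up three-layer network, used only to exercise the sketch.
network = [
    Layer("conv1", "conv"),
    Layer("pool1", "pool"),
    Layer("softmax", "softmax"),
]
schedule = traverse(network)
for name, engine in schedule:
    print(f"{name}: {engine}")
```

The sketch leaves out data movement entirely; in the design described here, that is the role of the DMA and the onboard memory.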

Onboard memory provides central storage for weights and feature maps, reducing traffic to external memory and, as a result, power consumption.


Key Features

  • Specially designed to provide outstanding performance for mobile with up to 4 TOPs; additional optimizations provide a further increase in real-world use cases.
  • Best-in-class efficiency at >4 TOPs/W.
  • Programmable layer engines for future-proofing.
  • Highly tuned for advanced geometry implementations.
  • Scalable onboard memory reduces external memory traffic.
  • Arm NN works with Android NNAPI to provide a translation layer between major neural network frameworks, such as TensorFlow and Caffe, and the Arm Machine Learning processor, as well as other Arm IP.

Performance

  • Greater than 4 TOPs in mobile environments.
  • Proprietary optimizations provide a further increase in real-world use cases.
  • Efficiency of >4 TOPs/W.
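
As a rough worked example of how the two headline figures relate, dividing throughput by efficiency gives an implied power draw. This is simple arithmetic under the assumption that both figures apply at the same operating point, which the text does not state. `implied_power_watts` is a hypothetical helper, not part of any Arm tool.

```python
def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Power (W) = throughput (TOPs) / efficiency (TOPs/W)."""
    if tops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    return tops / tops_per_watt


# Boundary values quoted in the text: 4 TOPs at 4 TOPs/W.
power = implied_power_watts(4.0, 4.0)
print(f"{power:.2f} W")  # 1.00 W
```

Because the stated efficiency is a lower bound (>4 TOPs/W), the implied power at 4 TOPs would be below this value under the same assumption.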

To learn more about Machine Learning on Arm, visit our ML Developer community.


Key Benefits

  • Most efficient solution to run neural networks.
  • Designed for the mobile and adjacent markets.
  • Optimized, ground-up design for machine learning acceleration.
  • Best-in-class performance with state-of-the-art, fixed-function engines.
  • Programmable engines for future innovation and algorithms.
  • Massive efficiency uplift over CPUs, GPUs, DSPs and accelerators.
  • Completes Arm’s heterogeneous Machine Learning platform solution.
  • Enabled by open-source software.
  • Industry-leading performance in thermally- and cost-constrained environments.

Applications

  • Mobile
  • AR/VR
  • IoT
  • Smart camera
  • Healthcare
  • Medical
  • Logistics
  • Small area
  • Robotics
  • Home
  • Consumer
  • Drones
  • Wearables

Webinar - Project Trillium: Optimizing ML Performance for any Application

Project Trillium is a suite of Arm IP designed to deliver scalable ML and neural network functionality at any point on the performance curve, from sensors, to mobile, and beyond. 

 
