Arm Machine Learning Processor 

Industry-leading performance and efficiency for inference at the edge.

Machine Learning Processor Block Diagram.

The Arm Machine Learning processor is an optimized, ground-up design for machine learning acceleration, targeting mobile and adjacent markets. The solution consists of state-of-the-art optimized fixed-function engines that provide best-in-class performance within a constrained power envelope.

Additional programmable layer engines support the execution of non-convolution layers and the implementation of selected primitives and operators, along with future innovation and algorithm generation. The network control unit manages the overall execution and traversal of the network, and the DMA engine moves data in and out of main memory.

Onboard memory provides central storage for weights and feature maps, reducing traffic to external memory and, therefore, power consumption.


Key Features

  • Specially designed to provide outstanding performance for mobile with up to 4.1 TOPs; additional optimizations provide a further increase in real-world use cases.
  • Best-in-class efficiency at >3 TOPs/W.
  • Programmable layer engines for future-proofing.
  • Highly tuned for advanced geometry implementations.
  • Scalable onboard memory reduces external memory traffic.
  • Arm NN works with Android NNAPI to provide a translation layer between major neural network frameworks, such as TensorFlow and Caffe, and the Arm Machine Learning processor, as well as other Arm IP.

Performance

  • Greater than 4.1 TOPs in mobile environments.
  • Proprietary optimizations provide a further increase in real-world use cases.
  • Efficiency of >3 TOPs/W.
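
The throughput and efficiency figures above can be related with simple arithmetic: dividing throughput by efficiency gives an implied power draw. The sketch below is illustrative only. It assumes both quoted figures apply at the same operating point, which this page does not state, and the function name `implied_power_watts` is hypothetical. Because both figures are quoted as lower bounds, the result indicates an order of magnitude, not a specification.

```python
def implied_power_watts(throughput_tops: float, efficiency_tops_per_watt: float) -> float:
    """Return the power in watts implied by a throughput (TOPs) and an efficiency (TOPs/W)."""
    if efficiency_tops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    return throughput_tops / efficiency_tops_per_watt


# Figures quoted on this page; assumed (not confirmed) to describe the same operating point.
quoted_throughput_tops = 4.1
quoted_efficiency_tops_per_watt = 3.0

power = implied_power_watts(quoted_throughput_tops, quoted_efficiency_tops_per_watt)
print(f"Implied power: {power:.2f} W")  # prints "Implied power: 1.37 W"
```

Under that assumption, the quoted figures correspond to a power draw on the order of 1.4 W.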

To learn more about Machine Learning on Arm, visit our ML Developer community.

Key Benefits

  • Most efficient solution to run neural networks.
  • Designed for the mobile and adjacent markets.
  • Optimized, ground-up design for machine learning acceleration.
  • Best-in-class performance with state-of-the-art, fixed-function engines.
  • Programmable engines for future innovation and algorithms.
  • Massive efficiency uplift over CPUs, GPUs, DSPs and other accelerators.
  • Completes Arm’s heterogeneous Machine Learning platform solution.
  • Enabled by open-source software.
  • Industry-leading performance in thermally- and cost-constrained environments.
  • When combined with the Arm Object Detection processor, provides highly efficient and optimized people detection.

Applications

  • Mobile
  • AR/VR
  • IoT
  • Smart camera
  • Healthcare
  • Medical
  • Logistics
  • Small area
  • Robotics
  • Home
  • Consumer
  • Drones
  • Wearables

Webinar - Project Trillium: Optimizing ML Performance for any Application

Project Trillium is a suite of Arm IP designed to deliver scalable ML and neural network functionality at any point on the performance curve, from sensors, to mobile, and beyond. 

 
