Arm Machine Learning Processor 

Industry-leading performance and efficiency for inference at the edge.

Based on a new, class-leading architecture, the Arm ML processor's optimized design enables new features, enhances user experience and delivers innovative applications for a wide array of market segments including mobile, IoT, embedded, automotive, and infrastructure. It provides a massive uplift in efficiency compared to CPUs, GPUs and DSPs through efficient convolution, sparsity and compression. 


Key Features

  • Designed to deliver outstanding performance for mobile, with 4 TOP/s of throughput and an efficiency of 5 TOPs/W; additional optimizations further increase effective performance in real-world use cases.
  • Programmable layer engines for future-proofing.
  • Incorporates a variety of compression technologies to minimize system memory bandwidth.
  • Highly tuned for implementation on advanced process geometries.
  • Supports secure operating mode to protect DNN IP and data.
  • High responsiveness reduces latency to improve user experience.
  • Supports TrustZone system security, with a secure operating mode and configurable secure queues for multiple users. Processing can run flexibly in the TEE or SEE for secure use cases such as biometric payment and content protection for high-value media streams.
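
As a rough illustration of what the headline figures imply, the sketch below works through the arithmetic. It uses the 4 TOP/s and 5 TOPs/W values quoted above and the 1 GHz clock given in the specifications. Two things are assumptions rather than statements from the source: that one multiply-accumulate (MAC) counts as two operations, and that both headline figures apply at the same operating point.

```python
# Back-of-the-envelope arithmetic from the quoted headline figures.
PEAK_OPS_PER_SEC = 4e12        # 4 TOP/s (from the source)
CLOCK_HZ = 1e9                 # 1 GHz (from the specifications)
EFFICIENCY_OPS_PER_WATT = 5e12 # 5 TOPs/W (from the source)

# Operations that must complete every clock cycle to reach the peak rate.
ops_per_cycle = PEAK_OPS_PER_SEC / CLOCK_HZ

# Assumption: one MAC is counted as two operations (multiply + add).
macs_per_cycle = ops_per_cycle / 2

# Assumption: peak throughput and quoted efficiency hold at the same time.
implied_power_watts = PEAK_OPS_PER_SEC / EFFICIENCY_OPS_PER_WATT

print(f"ops per cycle:  {ops_per_cycle:.0f}")
print(f"MACs per cycle: {macs_per_cycle:.0f}")
print(f"implied power:  {implied_power_watts:.2f} W")
```

Under these assumptions the figures correspond to thousands of operations per cycle at a sub-watt power budget; measured numbers for a specific configuration may differ.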
 
 

Key Benefits

  • Enables ML processing on the edge, saving power, reducing data consumption and enhancing user privacy.
  • Flexible design supports a variety of popular neural networks, including CNNs and RNNs, for classification, object detection, image enhancements, speech recognition and natural language understanding.
  • Winograd convolution accelerates common filters by 225% compared to other NPUs, delivering more performance in less area.
  • Reduces system memory bandwidth by 1.5-3x through a variety of compression technologies that target both weights and activation feature maps.
  • Tight system integration through ACE-Lite master port and optional SMMU integration allows for support and protection of memory and easy handling of multiple users.
  • The Arm Machine Learning processor is compatible with Arm NN, an inference engine for CPUs, GPUs and NPUs that bridges the gap between existing NN frameworks and the underlying IP.
Machine Learning Processor Block Diagram.
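
The Winograd benefit listed above can be illustrated with the textbook one-dimensional F(2,3) minimal filtering algorithm, which produces two outputs of a 3-tap filter using 4 multiplications instead of 6. Nesting it in two dimensions gives F(2x2, 3x3), which needs 16 multiplications instead of 36, a 2.25x reduction that is consistent with the 225% figure. The source does not say which Winograd variant the processor implements, so this is only a sketch; the function names are hypothetical.

```python
def direct_conv_1d(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3): 4 data multiplications."""
    # Filter transform; it can be precomputed once per filter.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Element-wise products of transformed data and transformed filter.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform: additions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


# Multiplication counts for one 2x2 output tile of a 3x3 filter.
DIRECT_MULTS_2D = 2 * 2 * 3 * 3   # 36
WINOGRAD_MULTS_2D = 4 * 4         # 16

data = [1.0, 2.0, 3.0, 4.0]
taps = [0.5, -1.0, 2.0]
print(direct_conv_1d(data, taps), winograd_f23(data, taps))
print(DIRECT_MULTS_2D / WINOGRAD_MULTS_2D)
```

The saving comes from trading multiplications for additions in the input and output transforms, which is why it applies to small fixed filter sizes such as 3x3 rather than to arbitrary layers.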

Find out more


To learn more about Machine Learning on Arm, visit our ML Developer community.


 


Specifications

Key Features

  • Performance (at 1 GHz): 4 TOP/s
  • Data Types: Int-8 and Int-16
  • Network Support: CNN and RNN
  • Efficient Convolution: Winograd support
  • Sparsity: Yes
  • Secure Mode: TEE or SEE
  • Multi-Core Capability: 8 NPUs in a cluster, 64 NPUs in a mesh

Memory System

  • Embedded SRAM: 1 MB
  • Bandwidth Reduction: Extended compression technology, layer/operator fusion
  • Main Interface: 1xAXI4 (128-bit), ACE-5 Lite

Development Platform

  • Neural Frameworks: TensorFlow, TensorFlow Lite, Caffe2, PyTorch, MXNet, ONNX
  • Neural Operator API: Arm NN, AndroidNN
  • Software Components: Arm NN, neural compiler, driver and support library
  • Debug and Profile: Layer-by-layer visibility
  • Evaluation and Early Prototyping: Arm Juno FPGA systems and cycle models
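
Because the supported data types are integer-only, floating-point weights and activations have to be quantized before they reach the processor. The sketch below shows symmetric per-tensor Int-8 quantization, which is one common scheme. It is not taken from the source and is not necessarily what the Arm toolchain does; the function names are hypothetical.

```python
def quantize_int8(values):
    """Map a non-empty list of floats to int8 codes plus one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Avoid division by zero for an all-zero tensor.
    scale = max_abs / 127 if max_abs else 1.0
    # Round to the nearest code and clamp to the symmetric range [-127, 127].
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes, scale)
print(restored)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for the smaller, cheaper integer representation.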

 


Applications

Mobile

AR/VR

IoT

Smart camera

Healthcare

Medical

Logistics

STB/DTV

Robotics

Home

Consumer 

Drones

Infrastructure

Get Support

Arm Support

Arm training courses and on-site system-design advisory services enable licensees to efficiently integrate the Arm ML processor into their design to realize maximum system performance with lowest risk and fastest time-to-market.

  • Arm training courses
  • Arm Design Reviews
  • Open a support case
