Specifications

Balanced inference efficiency and performance

Optimized for the most cost- and power-sensitive designs, Ethos-N57 delivers premium AI experiences in mainstream phones and DTVs. With the highest performance, an open-source software framework and the largest AI ecosystem, the Arm AI platform makes it easy to develop on Arm.

Figure: Arm Ethos-N block diagram. The Ethos-N57 mainstream ML inference processor contains eight compute engines.

Key features
  Performance (at 1 GHz): 2 TOP/s
  MACs (8x8): 1024
  Data types: Int-8 and Int-16
  Network support: CNN and RNN
  Efficient convolution: Winograd support
  Sparsity: Yes
  Secure mode: TEE or SEE
  Multicore capability: 8 NPUs in a cluster, 64 NPUs in a mesh

Memory system
  Embedded SRAM: 512 KB
  Bandwidth reduction: Extended compression technology, layer/operator fusion
  Main interface: 1xAXI4 (128-bit), ACE-5 Lite

Development platform
  Neural frameworks: TensorFlow, TensorFlow Lite, Caffe2, PyTorch, MXNet, ONNX
  Neural operator API: Arm NN, AndroidNN
  Software components: Arm NN, neural compiler, driver and support library
  Debug and profile: Layer-by-layer visibility
  Evaluation and early prototyping: Arm Juno FPGA systems and cycle models


Key features

Balanced Performance
Delivers up to 2 TOP/s of performance at 1 GHz using 1024 8-bit MACs.
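The headline figure follows from simple arithmetic, sketched below under the usual convention that one multiply-accumulate counts as two operations (a multiply and an add). The function name `peak_tops` is hypothetical and only for illustration; the MAC count and the 1 GHz clock come from the specification above.

```python
def peak_tops(macs_per_cycle: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Theoretical peak throughput in tera-operations per second (TOP/s).

    Assumes every MAC unit is busy on every cycle, which real workloads
    do not achieve; treat the result as an upper bound.
    """
    return macs_per_cycle * ops_per_mac * clock_hz / 1e12


# Ethos-N57: 1024 8-bit MACs at 1 GHz
n57 = peak_tops(macs_per_cycle=1024, clock_hz=1e9)
print(f"{n57:.3f} TOP/s")  # 2.048 TOP/s, quoted as 2 TOP/s
```

This is a peak number; sustained throughput depends on how well a given network keeps the MAC array occupied.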

Optimized Design
Delivers up to 2.25x the baseline convolution performance using Winograd on 3x3 kernels, with up to 90% MAC utilization.
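The Winograd figure is consistent with the standard Winograd minimal filtering algorithm F(2x2, 3x3), which computes a 2x2 output tile of a 3x3 convolution with 16 multiplications instead of 36. The source does not say which Winograd variant the Ethos-N hardware uses, so the sketch below is an assumption: it shows the one-dimensional building block F(2, 3) in plain Python, checks it against direct computation, and then derives the 2D multiplication ratio. All function names are hypothetical.

```python
def direct_f23(d, g):
    """Two outputs of a 3-tap filter over four inputs: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """Same two outputs using 4 multiplications (Winograd F(2, 3))."""
    # Filter transform: depends only on the weights, so it can be done offline.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # Four element-wise multiplications on transformed inputs.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform: additions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


d, g = [1, 2, 3, 4], [2, 4, 6]
assert winograd_f23(d, g) == direct_f23(d, g)

# Nesting F(2, 3) in two dimensions gives F(2x2, 3x3).
direct_mults_2d = (2 * 2) * (3 * 3)   # 4 outputs x 9 taps = 36
winograd_mults_2d = 4 * 4             # 4 x 4 transformed tile = 16
print(direct_mults_2d / winograd_mults_2d)  # 2.25
```

The saving applies to multiplications only; the input and output transforms add extra additions, which is one reason the quoted figure is a peak rather than a typical speed-up.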

High Efficiency
Internally distributed SRAM stores data close to the compute elements to save power and reduce DRAM access.

Futureproof
Supports a wide range of existing Machine Learning (ML) operations and future innovations through firmware updates and compiler technology.


Key benefits

The key benefits of Ethos-N57:

  • Supports a variety of popular neural networks, including CNNs and RNNs, for classification, object detection, image enhancements, speech recognition and natural language understanding
  • Reduces system memory bandwidth by 1.5-3x through clustering, sparsity and workload tiling, with lossless compression for weights and activations on select networks
  • Maximizes the number of parameters stored on-chip by storing compressed weights and activations in local SRAM and decompressing them on the fly
  • Leverages sparse power gating techniques to reduce power by up to 50%
  • Improves performance and extends battery life through intelligent data management techniques to minimize memory movement with up to 90% of accesses on chip
  • Supports TrustZone system security to safeguard sensitive data with support for secure and non-secure modes
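
To make the idea of storing compressed weights and decompressing them on the fly more concrete, here is a minimal sketch of one generic lossless scheme for sparse data: a one-bit-per-weight zero mask plus the packed non-zero values. This is an assumption for illustration only; the source does not describe the actual Ethos-N compression format, and the function names are hypothetical.

```python
def compress(weights):
    """Split weights into a zero/non-zero mask and the non-zero values."""
    mask = [w != 0 for w in weights]
    values = [w for w in weights if w != 0]
    return mask, values


def decompress(mask, values):
    """Rebuild the original weights exactly (lossless)."""
    it = iter(values)
    return [next(it) if keep else 0 for keep in mask]


def compressed_bits(mask, values, bits_per_weight=8):
    """Storage cost: one mask bit per weight plus each non-zero value."""
    return len(mask) + len(values) * bits_per_weight


weights = [0, 5, 0, 0, -3, 0, 0, 7]        # 8 Int-8 weights, 5 of them zero
mask, values = compress(weights)
assert decompress(mask, values) == weights  # round trip is exact

original_bits = len(weights) * 8
ratio = original_bits / compressed_bits(mask, values)
print(f"{ratio:.1f}x")  # 2.0x for this example
```

The ratio depends entirely on how sparse the data is: with no zeros, the mask is pure overhead and the result is slightly larger than the input.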

Ethos-N comparison

Products compared: Ethos-N78, Ethos-N77, Ethos-N57, Ethos-N37

Key features
  Performance (at 1 GHz)
    Ethos-N78: 10, 5, 2, 1 TOP/s
    Ethos-N77: 4 TOP/s
    Ethos-N57: 2 TOP/s
    Ethos-N37: 1 TOP/s
  MAC/Cycle (8x8)
    Ethos-N78: 4096, 2048, 1024, 512
    Ethos-N77: 2048
    Ethos-N57: 1024
    Ethos-N37: 512
  Efficient convolution (all): Winograd support delivers 2.25x peak performance over baseline
  Configurability
    Ethos-N78: 90+ design options
    Ethos-N77, Ethos-N57, Ethos-N37: Single product offering
  Network support (all): CNN and RNN
  Data types (all): Int-8 and Int-16
  Sparsity (all): Yes
  Secure mode (all): TEE or SEE
  Multicore capability (all): 8 NPUs in a cluster, 64 NPUs in a mesh

Memory system
  Embedded SRAM
    Ethos-N78: 384 KB to 4 MB
    Ethos-N77: 1-4 MB
    Ethos-N57: 512 KB
    Ethos-N37: 512 KB
  Bandwidth reduction
    Ethos-N78: Enhanced compression
    Ethos-N77, Ethos-N57, Ethos-N37: Extended compression technology, layer/operator fusion, clustering, and workload tiling
  Main interface (all): 1xAXI4 (128-bit), ACE-5 Lite

Development platform
  Neural frameworks (all): TensorFlow, TensorFlow Lite, Caffe2, PyTorch, MXNet, ONNX
  Inference deployment (all): Ahead of time compiled with TVM; online interpreted with Arm NN; Android Neural Networks API (NNAPI)
  Software components (all): Arm NN, compiler and support library, driver
  Debug and profile (all): Heterogeneous layer-by-layer visibility in Arm Development Studio Streamline
  Evaluation and early prototyping (all): Ethos-N Static Performance Analyzer (SPA), Arm Juno FPGA systems, Cycle Models