Arm Ethos-U System Design

  • Delivery method: Face-to-face (Private)

  • Location: Any location

  • Course Length: 3-4 days face-to-face or 6-8 half-days in a virtual classroom

  • Technology Focus: Combined Hardware and Software

  • Cost: Contact us for pricing

  • Provider: Arm



The Arm Ethos-U processor series training courses are designed to help embedded engineers working on new or existing Arm Ethos-U designs. Whether you’re working on design, verification or validation of an Arm Ethos-U system, the course can be configured according to your team’s needs.

Courses include fundamental topics that provide a solid foundation of understanding. The rest of the course builds on this foundation with optional topics and can be tailored to your team. Some key topics are delivered via pre-course on-demand video.

A pre-course call with the engineer delivering the training lets you discuss your team’s individual training requirements.

At the end of this course, delegates will be able to:

  • Describe the main functions and supported API of the Ethos-U Neural Processing Unit (NPU)
  • Explain the Ethos-U data flow and block architecture
  • Describe the physical model integration and the interface signals
  • Configure and integrate the Ethos-U into their SoC
  • Run the supplied test cases
  • Describe the Ethos-U NPU functional model
  • Describe the Arm Machine Learning SW architecture
  • Understand the Ethos-U driver stack, TensorFlow Lite Micro and CMSIS-NN
  • Describe the embedded "offline" tooling and online memory management

Course Length: 3-4 days

Delivery Method: Virtual or onsite


Audience

Engineers working on an SoC project that uses the Ethos-U NPU and carrying out system design or verification.


Prerequisites

  • A working knowledge of RTL design
  • Completion of the Machine Learning using Arm online training module
  • Background knowledge of machine learning

Related Products

Arm Ethos-U55, Ethos-U65


Agendas are created from the following list of fundamental and optional topics.

SW Topics

Ethos SW Flow

  • Software stack overview
  • Integration with subsystem
  • Cortex-A interfacing (Ethos-U65 only)
  • Device flow (offline and online)
  • Integration overview with TFLµ
  • Inference flow from Cortex-A to the Cortex-M/Yoda subsystem (Ethos-U65 only)

TensorFlow Lite Micro and CMSIS-NN

  • Overview of TensorFlow Lite Micro
  • TensorFlow Lite Micro architecture
  • Overview of CMSIS-NN
  • TFLµ integration with CMSIS-NN and Ethos-U
  • Build process of TensorFlow Lite Micro for Ethos-U

Ethos Offline/Online Memory Management

  • Vela memory management
  • Memory management in depth

Model Conditioning

  • Model Development Flow
  • Post-training optimizations
  • Training-time optimizations

Vela In depth

  • Understand Vela internals
  • Vela CLI options
  • Vela output options

Ethos Driver

  • Repositories
  • Building
  • Architecture
  • Upstream process

Model integration into a fast model environment*

HW Topics

Ethos Hardware Overview

  • Endpoint devices
  • Embedded ML software and tools
  • Ethos-U55 hardware and configurations
  • Typical Ethos-U55 data flow
  • Ethos-U55 performance
  • A use-case analysis
  • Ethos-U55 evaluation platform targets

Ethos Architecture

  • Functional Block
  • Computation units of work
  • Control and Data flow
  • Programmers model
  • Debug and Trace
  • Security

Ethos HW integration

  • Getting Started
  • Implementation process flow
  • Configuration
  • Physical Model integration
  • Clock and resets
  • Interface signals
  • Memory system considerations
  • Testbenches and Integration Kit

Ethos-U NPU Functional Model



Optional Topics

  • Difference between Ethos-U processors
  • Machine Learning using Arm*: online introduction to machine learning using Arm products
  • ArmNN: how to build and structure ArmNN and how to run a Caffe model
  • Compute Library: introduction, compiling and structuring, plus Gaussian pyramid, integral image, SGEMM and Winograd optimisation


* = Online and on-demand.
