Performance is key when running machine learning on endpoint devices. Depending on the application, use case, and underlying Arm-based hardware, developers need to tailor solutions to their specific needs.

The following guides help developers get the best performance from the supporting software throughout the development lifecycle. They show, step by step, how to build the software for the target platform, profile its performance in real time, and apply optimization techniques to deliver the best solution.

SDK configuration

For maximum control, build the Arm software environment from source and configure it as needed.

Configure the Arm NN SDK build environment
View the guide

Cross-compile Arm NN for Raspberry Pi
View the guide

Profiling

Profile the execution of a neural network at runtime using the Arm Streamline tool to gain key insights into the behavior of the software.

Run and profile Arm NN on the Raspberry Pi
View the guide

Profile AlexNet on Raspberry Pi and HiKey 960
View the guide

Optimization

Optimize the neural network, or configure the software, to achieve better ML results.

Optimize TensorFlow models for mobile and embedded devices
View the guide

Speed up Arm NN performance using FP16 and Fast Math
View the guide

Best practice

Follow recommended best practices to achieve optimum results.

Explore an MCU-friendly face recognition model
View the guide

Other guides