This guide explains how to use Streamline to profile the AlexNet example application from the Compute Library for both the Raspberry Pi and the HiKey 960 board.

As Project Trillium, Arm’s Machine Learning (ML) platform, enables new applications and capabilities, it is important that you monitor your software's performance to ensure that algorithms and applications deliver excellent user experiences.

You can do this using Arm's Streamline, a performance analyzer that makes it easy to profile and optimize software that runs on Arm-based processors.

AlexNet is a convolutional neural network (CNN) that classifies images into one of 1000 categories.

The Compute Library provides a way to address performance and portability challenges. It helps you avoid rewriting applications for different target hardware and gives you confidence that lower-level functionality is optimized.

In this guide, we explain how to run the AlexNet example on two different hardware platforms. The first section covers the Raspberry Pi 3 running Ubuntu MATE, and the second section covers the HiKey 960 development platform running Android AOSP. The Raspberry Pi 3 contains four Arm Cortex-A53 cores, and the HiKey 960 is based on an Arm big.LITTLE processor with four Arm Cortex-A73 cores and four Cortex-A53 cores.

Running on two different platforms demonstrates the flexibility of the Compute Library. This guide also provides tips on setting up each platform, and uses Streamline to show differences in how the Compute Library runs on different hardware.

Continue through this guide to start learning how to profile on the Raspberry Pi. If you want to jump straight to profiling on the HiKey 960, go to Install and build Compute Library on HiKey 960.