Streamline
Streamline is a performance analyzer for software running on Arm processors. It can be downloaded and installed from Arm Developer, and a thirty-day trial is available if you need a license.
Before profiling the application with Streamline, let’s look at the code in the mnist-demo application. Open either mnist_tf_convol.cpp or mnist_tf_simple.cpp.
Here are the steps to follow:
- Load and parse the MNIST data
- Import the TensorFlow graph
- Optimize for a specific compute device
- Run the graph
The helper function in mnist_loader.hpp scans the dataset files and returns an MnistImage struct with two fields: the label and an array of pixel values.
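The MNIST dataset is distributed as IDX files: a big-endian header followed by raw label or pixel bytes. The sketch below shows what a loader along the lines of mnist_loader.hpp might do; the MnistImage layout, the function name, and the scaling of pixels to [0, 1] are illustrative assumptions, not the demo's exact definitions.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative stand-in for the MnistImage struct: one label plus the
// 28x28 pixel values.
struct MnistImage {
    uint8_t label;
    std::vector<float> pixels;  // 784 values, scaled to [0, 1]
};

// IDX files store integers big-endian; convert four bytes to host order.
static uint32_t ReadBigEndian32(std::istream& in) {
    uint8_t b[4];
    in.read(reinterpret_cast<char*>(b), 4);
    return (uint32_t(b[0]) << 24) | (uint32_t(b[1]) << 16) |
           (uint32_t(b[2]) << 8) | uint32_t(b[3]);
}

// Load one image/label pair by index from a pair of MNIST IDX files.
MnistImage LoadMnistImage(const std::string& imagePath,
                          const std::string& labelPath, size_t index) {
    std::ifstream images(imagePath, std::ios::binary);
    std::ifstream labels(labelPath, std::ios::binary);
    if (!images || !labels) throw std::runtime_error("cannot open dataset");

    // Magic numbers: 0x803 for image files, 0x801 for label files.
    if (ReadBigEndian32(images) != 0x803 || ReadBigEndian32(labels) != 0x801)
        throw std::runtime_error("bad IDX magic number");

    uint32_t count = ReadBigEndian32(images);
    ReadBigEndian32(labels);  // label count, unused here
    uint32_t rows = ReadBigEndian32(images);
    uint32_t cols = ReadBigEndian32(images);
    if (index >= count) throw std::out_of_range("image index");

    MnistImage out;
    labels.seekg(8 + index, std::ios::beg);  // 8-byte label header
    labels.read(reinterpret_cast<char*>(&out.label), 1);

    size_t pixelCount = size_t(rows) * cols;
    images.seekg(16 + index * pixelCount, std::ios::beg);  // 16-byte header
    std::vector<uint8_t> raw(pixelCount);
    images.read(reinterpret_cast<char*>(raw.data()), pixelCount);
    for (uint8_t p : raw) out.pixels.push_back(p / 255.0f);
    return out;
}
```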
You can import a TensorFlow graph from either the text or the binary Protobuf format.
Importing a graph involves binding the input and output points of the model graph.
You can find these points by visualizing the model in TensorBoard.
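As a rough sketch of what that import and binding look like with the Arm NN TensorFlow parser: the binding-point names ("input", "output") and the tensor shape below are assumptions taken from typical MNIST sample models, and the code needs the Arm NN SDK headers to build, so treat it as an outline rather than the demo's exact source.

```cpp
#include "armnnTfParser/ITfParser.hpp"

// Create the TensorFlow parser and import the binary Protobuf graph,
// binding the named input and output points found via TensorBoard.
// The shape {1, 784, 1, 1} is an assumption for a flattened 28x28 input.
armnnTfParser::ITfParserPtr parser = armnnTfParser::ITfParser::Create();
armnn::INetworkPtr network = parser->CreateNetworkFromBinaryFile(
    "model/mnist_model.pb",          // hypothetical model path
    {{"input", {1, 784, 1, 1}}},     // input binding point and tensor shape
    {"output"});                     // output binding point
```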
Note: After this step, the code is common regardless of the framework that you started with.
Arm NN supports optimization for both CPU and GPU devices.
You specify the device when creating the execution runtime context in the code.
Running the inference on the chosen compute device is performed through the EnqueueWorkload() function of the context object.
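The optimize-then-run sequence can be sketched as follows. This is written from memory of the Arm NN C++ API of that era and requires the SDK to build; MakeInputTensors and MakeOutputTensors are helper templates defined in the demo's own source, and the binding-info and buffer names are assumptions.

```cpp
// Create the runtime context and optimize the imported network for a
// device: swap Compute::CpuAcc for Compute::GpuAcc to target the Mali GPU.
armnn::IRuntime::CreationOptions options;
armnn::IRuntimePtr runtime = armnn::IRuntime::Create(options);
armnn::IOptimizedNetworkPtr optNet =
    armnn::Optimize(*network, {armnn::Compute::CpuAcc},
                    runtime->GetDeviceSpec());

// Load the optimized network, then run one inference through it.
armnn::NetworkId networkId;
runtime->LoadNetwork(networkId, std::move(optNet));
armnn::Status ret = runtime->EnqueueWorkload(
    networkId,
    MakeInputTensors(inputBindingInfo, &input[0]),
    MakeOutputTensors(outputBindingInfo, &output[0]));
```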
The result of the inference can be read directly from the output array and compared to the MnistImage label that we read from the data file.
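For a ten-class MNIST model, the output array holds one score per digit, so the predicted digit is simply the index of the largest score. A minimal, self-contained sketch of that comparison (function and variable names are assumptions, not the demo's):

```cpp
#include <algorithm>
#include <array>
#include <cstddef>

// The network emits one score per digit 0-9; the prediction is the index
// of the highest score (argmax over the output array).
size_t PredictedDigit(const std::array<float, 10>& output) {
    return std::distance(output.begin(),
                         std::max_element(output.begin(), output.end()));
}
```

Checking correctness is then a single comparison against the loaded label, e.g. `bool correct = PredictedDigit(output) == image.label;`.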