Profiling AlexNet on Raspberry Pi and HiKey 960 with the Arm Compute Library
Build the Arm Compute Library on HiKey 960
Running the AlexNet example on the HiKey 960 with Android provides a comparison to the Raspberry Pi.
This article on Profiling Android with the HiKey 960 explains how to get Android running on the HiKey 960. The only difference is that the latest AOSP release from Linaro is newer than the one available when the article was written.
- Install Streamline and gatord. As with the Pi, using gatord with the perf API is recommended. Gator can be compiled from GitHub using the NDK or, as with the Pi, copied from the DS-5 directory, as shown here. Note that the path differs from the one used for the Pi because the HiKey 960 is 64-bit.
$ sudo adb remount
$ sudo adb push $DS5_HOME/sw/streamline/bin/arm64/gatord /system
$ sudo adb root
$ sudo adb shell
# cd /system
# chmod +x gatord
# ./gatord &
- Install the Android NDK. It can be added to the Android SDK or used standalone. For Android, the Arm Compute Library is cross-compiled using the Android NDK. Once compiled, the examples are copied to the HiKey 960 using adb.
- The Arm Compute Library should be compiled with Clang because GCC is no longer supported. Set up Clang by generating a standalone toolchain from the NDK with these commands:
$ export NDK=/home/<user-name>/Android/Sdk/ndk-bundle
$ $NDK/build/tools/make_standalone_toolchain.py --arch arm64 --api 23 --stl gnustl --install-dir /ml/cl/toolchains/aarch64
This creates a standalone toolchain in the specified installation directory.
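A quick sanity check of the generated toolchain can save a failed build later. The following is a minimal sketch, assuming the standalone toolchain places clang and clang++ drivers in its bin/ directory; check_toolchain is a hypothetical helper name, not part of the NDK.

```shell
#!/bin/sh
# Sketch: confirm that a standalone toolchain directory contains the Clang drivers.
# check_toolchain is a hypothetical helper, not an NDK tool.
# Assumption: the toolchain exposes bin/clang and bin/clang++.
check_toolchain() {
    toolchain_dir=$1
    for tool in clang clang++; do
        # -x is true only if the file exists and is executable.
        if [ ! -x "$toolchain_dir/bin/$tool" ]; then
            echo "missing: $toolchain_dir/bin/$tool"
            return 1
        fi
    done
    echo "ok: $toolchain_dir"
}
```

For the installation directory used above, this would be called as check_toolchain /ml/cl/toolchains/aarch64.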
- Next, add the bin/ directory of the toolchain to the PATH environment variable and compile the Arm Compute Library for Android:
$ export PATH=/ml/cl/toolchains/aarch64/bin:$PATH
$ git clone https://github.com/Arm-software/ComputeLibrary.git
$ cd ComputeLibrary
$ CXX=clang++ CC=clang scons Werror=0 debug=1 asserts=0 neon=1 opencl=1 os=android arch=arm64-v8a -j8
Note: We tested with NDK version r16b, which is newer than the r14 version that the Arm Compute Library was tested with. Only one warning blocked compilation with Werror=1, and we submitted a patch for it that has already been incorporated into the Arm Compute Library.
- Once the build is complete, copy the examples to the HiKey 960 using adb:
$ sudo adb push build/examples/graph_alexnet /data/local/tmp
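Because the example is cross-compiled, it can be useful to confirm on the host that the binary really targets 64-bit Arm. The sketch below reads the ELF header directly with od; elf_is_aarch64 is a hypothetical helper name, and the check assumes a little-endian ELF file. The field offsets and the AArch64 machine value (183) come from the ELF format itself.

```shell
#!/bin/sh
# Sketch: inspect the ELF header to see whether a binary targets 64-bit Arm.
# elf_is_aarch64 is a hypothetical helper; it assumes a little-endian ELF file.
elf_is_aarch64() {
    file=$1
    # Bytes 0-3 hold the magic number: 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$file" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ] || return 1
    # Byte 4 (EI_CLASS) is 2 for 64-bit objects.
    class=$(od -An -tu1 -j4 -N1 "$file" | tr -d ' \n')
    # Bytes 18-19 (e_machine) hold 183 for AArch64, low byte first.
    set -- $(od -An -tu1 -j18 -N2 "$file")
    [ $# -eq 2 ] || return 1
    machine=$(( $1 + $2 * 256 ))
    [ "$class" = "2" ] && [ "$machine" -eq 183 ]
}
```

The function returns success only for a 64-bit AArch64 ELF file, so it can guard the push step, for example by running elf_is_aarch64 build/examples/graph_alexnet before calling adb.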
- Copy the data needed by the example. Unzip the downloaded .zip file on the host machine, adjusting the path to the file as needed, then copy the data to the HiKey 960:
$ unzip compute_library_alexnet.zip -d assets_alexnet
$ sudo adb push ./assets_alexnet /data/local/tmp
- The Android version of the Arm Compute Library and the examples are statically linked for all libraries except libOpenCL.so. This means that, unlike on the Raspberry Pi, the .so files in the build/ directory, such as libarm_compute_core.so, are not needed on the target system, but the OpenCL library is required. To run the example, create a libOpenCL.so as shown here:
$ cp /system/lib64/egl/libGLES_mali.so /data/local/tmp/libOpenCL.so
$ export LD_LIBRARY_PATH=/data/local/tmp
If this is not done correctly, the resulting error will be:
CANNOT LINK EXECUTABLE "./graph_alexnet": library "libOpenCL.so" not found
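A common cause of this error is a copy whose name differs from libOpenCL.so only in capitalisation, because the file name lookup is case-sensitive. The following is a minimal sketch, using only POSIX shell features, that searches each entry of a colon-separated library path for the exact name and reports near misses; find_opencl_lib is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: look for libOpenCL.so, with exact capitalisation, in a library search path.
# find_opencl_lib is a hypothetical helper; pass it the value of LD_LIBRARY_PATH.
find_opencl_lib() {
    search_path=$1
    old_ifs=$IFS
    IFS=:
    # With IFS set to ':', the unquoted expansion splits into one directory per word.
    for dir in $search_path; do
        IFS=$old_ifs
        if [ -e "$dir/libOpenCL.so" ]; then
            echo "found: $dir/libOpenCL.so"
            return 0
        fi
        # Report files that differ only in letter case, such as libOpenCl.so.
        for candidate in "$dir"/lib[Oo]pen[Cc][Ll].so; do
            if [ -e "$candidate" ]; then
                echo "wrong case: $candidate"
            fi
        done
    done
    IFS=$old_ifs
    echo "not found"
    return 1
}
```

In the shell where graph_alexnet is launched, calling find_opencl_lib "$LD_LIBRARY_PATH" shows whether the library is visible under the expected name.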