A software library for best-in-class machine learning performance on Arm

The Arm Compute Library is a collection of low-level machine learning functions optimized for Cortex-A CPU and Mali GPU architectures. The library provides ML acceleration on Cortex-A CPUs through Neon and SVE, and acceleration on Mali GPUs through OpenCL.

The library is extremely flexible and allows developers to use machine learning functions individually or as part of complex pipelines to accelerate their algorithms and applications. Enabled by Arm microarchitecture optimizations, the Arm Compute Library delivers better performance than other open-source alternatives and immediate support for new Arm technologies, for example, SVE2. The Arm Compute Library is open-source software available under a permissive MIT license.
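Using a single function from the library is straightforward. The minimal sketch below runs one Neon-accelerated operation (a ReLU activation) on the CPU; the class and method names follow the library's public Neon runtime API, but check the headers of your release before relying on them:

```cpp
// Minimal sketch: running one Compute Library function on the CPU.
// Assumes the library is built and linked (e.g. -larm_compute).
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

int main()
{
    Tensor src, dst;

    // Describe two 32x32 FP32 tensors; memory is allocated later.
    src.allocator()->init(TensorInfo(TensorShape(32U, 32U), 1, DataType::F32));
    dst.allocator()->init(TensorInfo(TensorShape(32U, 32U), 1, DataType::F32));

    // Configure a single function from the library: a ReLU activation.
    NEActivationLayer relu;
    relu.configure(&src, &dst,
                   ActivationLayerInfo(ActivationLayerInfo::ActivationFunction::RELU));

    // Allocate backing memory after configuration.
    src.allocator()->allocate();
    dst.allocator()->allocate();

    relu.run(); // Executes the Neon-optimized kernel.
    return 0;
}
```

The same configure-allocate-run pattern applies to every function in the library, so individual operators can be composed into larger pipelines.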

Key features

  • Over 100 machine learning functions for CPU and GPU
  • Multiple convolution algorithms (GEMM, Winograd, FFT, and Direct)
  • Support for multiple data types: FP32, FP16, int8, uint8, BFloat16
  • Micro-architecture optimization for key ML primitives
  • Highly configurable build options
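The configurable build can be illustrated with the library's scons-based build system. The option names below are assumptions drawn from common configurations; check `scons --help` in your release for the authoritative list:

```shell
# Hypothetical build invocations showing how the scons options select
# the target OS, architecture, and acceleration backends.

# CPU-only (Neon) release build for Linux on AArch64:
scons os=linux arch=arm64-v8a neon=1 opencl=0 -j8

# Android build with both Neon and OpenCL (Mali GPU) enabled:
scons os=android arch=arm64-v8a neon=1 opencl=1 embed_kernels=1 -j8
```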


The Arm Compute Library achieves superior performance through techniques such as kernel fusion, fast-math enablement, and OpenCL texture utilization.
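Kernel fusion and fast math are exposed directly on the function interfaces. The sketch below fuses a ReLU activation into a convolution and opts in to fast math; the configure() overload shown (with an ActivationLayerInfo and an enable_fast_math flag) follows recent releases of the Neon runtime API, so verify it against your version's headers:

```cpp
// Sketch: configuring a convolution with a fused ReLU and fast math enabled.
#include "arm_compute/runtime/NEON/NEFunctions.h"
#include "arm_compute/runtime/Tensor.h"

using namespace arm_compute;

void configure_fused_conv(Tensor &src, Tensor &weights, Tensor &bias,
                          Tensor &dst, NEConvolutionLayer &conv)
{
    conv.configure(&src, &weights, &bias, &dst,
                   PadStrideInfo(1, 1, 1, 1), // stride 1, padding 1
                   WeightsInfo(),
                   Size2D(1U, 1U),            // no dilation
                   // Fuse the activation into the convolution kernel
                   // instead of running it as a separate pass:
                   ActivationLayerInfo(
                       ActivationLayerInfo::ActivationFunction::RELU),
                   /*enable_fast_math=*/true); // allow faster algorithms
                                               // (e.g. Winograd) that may
                                               // relax numerical precision
}
```

Fusing the activation avoids an extra round trip through memory, which is where much of the fusion speedup comes from.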

ML performance can be improved further by tuning the Arm Compute Library for the intended workloads, for example through OpenCL local work-group size (LWS) tuning or optimized GEMM heuristics. The following diagram shows a mean comparison of the Arm NN performance improvements.
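LWS tuning is enabled by attaching a tuner to the OpenCL scheduler before any functions are configured. The sketch below uses the CLTuner class from the public CL runtime API; treat the exact names as assumptions to be checked against your release:

```cpp
// Sketch: enabling the OpenCL LWS auto-tuner. The tuner benchmarks
// candidate local work-group sizes per kernel and caches the fastest one.
#include "arm_compute/runtime/CL/CLScheduler.h"
#include "arm_compute/runtime/CL/CLTuner.h"

using namespace arm_compute;

int main()
{
    CLTuner tuner; // records the best LWS found for each kernel

    // Initialize the OpenCL scheduler with the tuner attached;
    // must happen before configuring any CL functions.
    CLScheduler::get().default_init(&tuner);

    // ... configure and run CL functions as usual; tuned
    // work-group sizes are reused on subsequent runs ...
    return 0;
}
```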

Arm NN performance improvements graph

Availability and usage

The Arm Compute Library is specifically designed for Arm-based architectures and has been deployed on over a billion devices. The Arm Compute Library is trusted by silicon vendors, OEMs, and ISVs across the globe to improve their products, and is used today to power machine learning in smartphones, digital TVs, automotive systems, AR and VR products, smart cameras, and many more devices.

The Arm Compute Library is lightweight, configurable, flexible, and truly operating system agnostic, with support for Android, Linux, and bare-metal applications. These traits enable application developers to adopt machine learning functionality quickly, focus on differentiation, and reduce time to market.

What makes Arm Compute Library unique?

Many available machine learning libraries are generic and platform agnostic and do not meet the performance needs of our partners. Arm partners select the Arm Compute Library for its Arm-specific optimizations: everything is in one place, with highly targeted optimization for Arm 64-bit architectures.

Watch the testimonial video from Ampere about the Arm Compute Library.
