Arm Scalable Vector Extensions and application to Machine Learning

Download this white paper to learn how the vector-length-agnostic techniques introduced by SVE can be used to efficiently vectorize General Matrix Multiplication (GEMM) and low-precision GEMM (GEMMlowp).

About SVE

SVE is a vector extension to the A64 instruction set, which is used in the AArch64 execution state of the Armv8 architecture. Unlike other SIMD architectures, SVE does not fix the size of the vector registers. Instead, it constrains the size to a multiple of 128 bits, from a minimum of 128 bits up to a maximum of 2048 bits. Any CPU vendor can therefore implement the extension with the vector register size that best suits the workloads the CPU is targeting. The design of SVE guarantees that the same program can run on different implementations of the ISA without recompilation.
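The idea of writing code that does not depend on the register size can be illustrated with a small model. The following is a minimal Python sketch, not SVE code: `legal_vector_lengths` and `vla_dot` are hypothetical names introduced here, and the inner loop only imitates what a single vector instruction would do across its lanes. It assumes 32-bit elements and shows that one routine gives the same answer for every vector length in the range described above.

```python
def legal_vector_lengths():
    """Vector lengths in bits allowed by the range described above:
    multiples of 128 from 128 up to 2048."""
    return list(range(128, 2048 + 1, 128))


def vla_dot(a, b, vl_bits, elem_bits=32):
    """Dot product written without assuming a particular vector length.

    vl_bits plays the role of the hardware vector length, which the
    routine only learns at run time (an assumption of this model).
    """
    lanes = vl_bits // elem_bits        # elements per vector register
    acc = [0] * lanes                   # one partial sum per lane
    for i in range(0, len(a), lanes):   # advance by one vector per step
        for lane in range(lanes):       # models the lanes of one instruction
            j = i + lane
            if j < len(a):              # inactive lanes on the final, partial vector
                acc[lane] += a[j] * b[j]
    return sum(acc)                     # horizontal reduction of the lanes


a = list(range(1, 11))      # 1, 2, ..., 10
b = list(range(10, 0, -1))  # 10, 9, ..., 1
results = {vl: vla_dot(a, b, vl) for vl in legal_vector_lengths()}
```

Every entry of `results` holds the same value, whether the modeled register has 4 lanes (128 bits) or 64 lanes (2048 bits). The element count is not a multiple of every lane count, so the last iteration usually covers a partial vector, which is the case the bounds check stands in for.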

Most SVE instructions also take a predicate register that masks individual lanes, which allows them to operate on partial vectors. The SVE instruction set also provides gather loads and scatter stores, as well as truncating stores and sign-extending and zero-extending loads.
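The features listed above can be modeled in a few lines. This is a hedged Python sketch of the semantics only, not the SVE instruction set or its intrinsics: the function names are hypothetical (`whilelt` is borrowed from the SVE mnemonic of the same name), vectors are plain lists, and inactive lanes of a load are assumed to read as zero.

```python
def whilelt(start, limit, lanes):
    """Predicate with lane k active while start + k < limit."""
    return [start + lane < limit for lane in range(lanes)]


def gather_load(memory, indices, pred):
    """Load memory[indices[k]] into each active lane; inactive lanes get 0."""
    return [memory[i] if p else 0 for i, p in zip(indices, pred)]


def scatter_store(memory, indices, values, pred):
    """Store values[k] to memory[indices[k]] for each active lane only."""
    for i, v, p in zip(indices, values, pred):
        if p:
            memory[i] = v


def truncating_store_bytes(memory, offset, values, pred):
    """Store only the low 8 bits of each active lane to consecutive bytes."""
    for lane, (v, p) in enumerate(zip(values, pred)):
        if p:
            memory[offset + lane] = v & 0xFF


def extending_load_bytes(memory, offset, pred, signed):
    """Load consecutive bytes into wider lanes, sign- or zero-extending."""
    out = []
    for lane, p in enumerate(pred):
        if not p:
            out.append(0)               # inactive lane: memory is not accessed
            continue
        byte = memory[offset + lane]
        out.append(byte - 256 if signed and byte >= 128 else byte)
    return out


# Two elements remain (indices 4 and 5) but the modeled vector has 4 lanes.
pred = whilelt(4, 6, 4)

table = [10, 20, 30, 40, 50]
gathered = gather_load(table, [4, 0, 2, 1], pred)

out = [0] * 5
scatter_store(out, [3, 1, 0, 2], [7, 8, 9, 10], pred)

buf = bytearray(4)
all_active = [True] * 4
truncating_store_bytes(buf, 0, [0x1FF, 0x80, 5, 6], all_active)
as_signed = extending_load_bytes(buf, 0, all_active, signed=True)
as_unsigned = extending_load_bytes(buf, 0, all_active, signed=False)
```

With `pred` covering only the first two lanes, the gather and scatter touch two elements each and leave the rest alone, which is how a predicated loop handles its tail without a separate scalar epilogue. The byte round trip shows why the load needs a signed and an unsigned form: the stored byte `0xFF` reads back as -1 or 255 depending on the extension chosen.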
