NEON intrinsics

NEON intrinsics map closely to NEON instructions.

The documentation for each intrinsic begins with a list of function prototypes, with a comment specifying an equivalent assembler instruction. The compiler selects an instruction that has the required semantics, but there is no guarantee that the compiler produces the listed instruction.

The intrinsics use a naming scheme that is similar to the NEON unified assembler syntax. That is, each intrinsic has the form:

<opname><flags>_<type>


The optional q flag specifies that the intrinsic operates on 128-bit vectors.
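
As a minimal sketch of the q flag, the following C code adds 16-bit lanes with vadd_s16 (64-bit vectors, four lanes) and with vaddq_s16 (128-bit vectors, eight lanes). The function names add_s16x4 and add_s16x8 and the HAVE_NEON macro are hypothetical, introduced only for this illustration. The scalar fallback is an assumption added so that the sketch also builds on targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: adds four signed 16-bit lanes.
   Without the q flag, the intrinsic operates on a 64-bit vector. */
void add_s16x4(const int16_t *a, const int16_t *b, int16_t *out)
{
#if HAVE_NEON
    int16x4_t va = vld1_s16(a);
    int16x4_t vb = vld1_s16(b);
    vst1_s16(out, vadd_s16(va, vb));
#else
    /* Scalar fallback (assumption): same lane-wise add, wrapping to 16 bits. */
    for (int i = 0; i < 4; ++i)
        out[i] = (int16_t)(uint16_t)((uint16_t)a[i] + (uint16_t)b[i]);
#endif
}

/* Hypothetical helper: adds eight signed 16-bit lanes.
   With the q flag, the intrinsic operates on a 128-bit vector. */
void add_s16x8(const int16_t *a, const int16_t *b, int16_t *out)
{
#if HAVE_NEON
    int16x8_t va = vld1q_s16(a);
    int16x8_t vb = vld1q_s16(b);
    vst1q_s16(out, vaddq_s16(va, vb));
#else
    /* Scalar fallback (assumption): same lane-wise add, wrapping to 16 bits. */
    for (int i = 0; i < 8; ++i)
        out[i] = (int16_t)(uint16_t)((uint16_t)a[i] + (uint16_t)b[i]);
#endif
}
```

The operation name (vadd) and the type suffix (s16) are the same in both calls. Only the q flag, and with it the vector width, differs.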

For example:

  • vmul_s16 multiplies two vectors of signed 16-bit values.

    This compiles to VMUL.I16 d2, d0, d1.

  • vaddl_u8 is a long add of two 64-bit vectors containing unsigned 8-bit values, resulting in a 128-bit vector of unsigned 16-bit values.

    This compiles to VADDL.U8 q1, d0, d1.

Registers other than those specified in these examples might be used. In addition, the compiler might perform optimization that in some way changes the instruction that the source code compiles to.
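
The two examples above might be called from C as in the following sketch. The wrapper names mul_s16x4 and addl_u8x8 and the HAVE_NEON macro are hypothetical. The scalar fallback is an assumption added so that the code also builds where NEON is unavailable. The instructions and registers that the compiler actually emits can differ from those listed above.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical wrapper: lane-wise multiply of four signed 16-bit values.
   Each result lane keeps the low 16 bits of the product. */
void mul_s16x4(const int16_t *a, const int16_t *b, int16_t *out)
{
#if HAVE_NEON
    vst1_s16(out, vmul_s16(vld1_s16(a), vld1_s16(b)));
#else
    /* Scalar fallback (assumption): truncate each product to 16 bits. */
    for (int i = 0; i < 4; ++i)
        out[i] = (int16_t)(uint16_t)(a[i] * b[i]);
#endif
}

/* Hypothetical wrapper: long add of eight unsigned 8-bit values.
   The inputs are 64-bit vectors; the result is a 128-bit vector of
   unsigned 16-bit values, so sums above 255 are not truncated. */
void addl_u8x8(const uint8_t *a, const uint8_t *b, uint16_t *out)
{
#if HAVE_NEON
    vst1q_u16(out, vaddl_u8(vld1_u8(a), vld1_u8(b)));
#else
    /* Scalar fallback (assumption): widen before adding. */
    for (int i = 0; i < 8; ++i)
        out[i] = (uint16_t)((uint16_t)a[i] + (uint16_t)b[i]);
#endif
}
```

Note that vaddl_u8 has no q flag even though it returns a 128-bit vector: its operands are 64-bit vectors, and the l in the name marks the widening.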


The intrinsic function prototypes in this documentation use the following type annotations:


__const(n)

The argument n must be a compile-time constant.

__constrange(min, max)

The argument must be a compile-time constant in the range min to max.


__transfersize(n)

The intrinsic loads n lanes from this pointer.
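
The following sketch illustrates how these annotations constrain a caller. It assumes that the prototypes of vld1_s16, vshr_n_s16, and vget_lane_s16 are annotated as shown in the comments. Check the individual intrinsic descriptions for the exact annotations. The function name shift_and_pick and the HAVE_NEON macro are hypothetical, and the scalar fallback is an assumption for targets without NEON.

```c
#include <stdint.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: loads four lanes, shifts each right by 2 bits,
   and returns lane 3. src must point to at least four int16_t values. */
int16_t shift_and_pick(const int16_t *src)
{
#if HAVE_NEON
    /* Pointer annotated with a transfer size of 4 (assumed):
       the load reads exactly four lanes from src. */
    int16x4_t v = vld1_s16(src);

    /* Shift count annotated with a constant range of 1 to 16 (assumed):
       it must be a literal or other compile-time constant. */
    int16x4_t s = vshr_n_s16(v, 2);

    /* Lane index annotated with a constant range of 0 to 3 (assumed):
       a run-time variable is not accepted here. */
    return vget_lane_s16(s, 3);
#else
    /* Scalar fallback (assumption): matches the NEON result for
       non-negative inputs; right-shifting a negative value is
       implementation-defined in C. */
    return (int16_t)(src[3] >> 2);
#endif
}
```

Passing a variable as the shift count or the lane index is expected to be rejected at compile time, because those arguments are encoded in the instruction itself.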


The NEON intrinsic function prototypes that use __fp16 are only available for targets that have the NEON half-precision VFP extension. To enable use of __fp16, use the --fp16_format command-line option.