Intrinsics

The intrinsics described in this topic map closely to NEON instructions. Each topic begins with a list of function prototypes, with a comment specifying an equivalent assembler instruction. The compiler selects an instruction that has the required semantics, but there is no guarantee that the compiler produces the listed instruction.

The intrinsics use a naming scheme that is similar to the NEON unified assembler syntax. That is, each intrinsic has the form:

<opname><flags>_<type>

An additional q flag is provided to specify that the intrinsic operates on 128-bit vectors.

For example:

  • vmul_s16 multiplies two 64-bit vectors of signed 16-bit values.

    This compiles to VMUL.I16 d2, d0, d1.

  • vaddl_u8 is a long add of two 64-bit vectors containing unsigned 8-bit values, resulting in a 128-bit vector of unsigned 16-bit values.

    This compiles to VADDL.U8 q1, d0, d1.

Registers other than those specified in these examples might be used. In addition, the compiler might perform optimizations that change which instruction the source code compiles to.
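The two intrinsics above can be exercised from C as in the following minimal sketch. It assumes that arm_neon.h is available and that the compiler defines __ARM_NEON or __ARM_NEON__ when NEON is enabled. The helper names mul_s16x4 and addl_u8x8 and the HAVE_NEON macro are hypothetical, and the scalar fallback is an addition that only lets the sketch build on hosts without NEON. In vaddl_u8, vadd is the operation name, l is the flag for a long operation, and u8 is the element type.

```c
#include <stdint.h>

/* Assumption: one of these macros is defined when NEON is enabled. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: element-wise multiply of four signed 16-bit lanes. */
static void mul_s16x4(const int16_t a[4], const int16_t b[4], int16_t out[4])
{
#if HAVE_NEON
    int16x4_t va = vld1_s16(a);        /* load 4 lanes into a 64-bit vector */
    int16x4_t vb = vld1_s16(b);
    vst1_s16(out, vmul_s16(va, vb));   /* might compile to VMUL.I16 */
#else
    int i;
    for (i = 0; i < 4; ++i)
        out[i] = (int16_t)(a[i] * b[i]);   /* keeps the low 16 bits on typical compilers */
#endif
}

/* Hypothetical helper: long add of eight unsigned 8-bit lanes.
   The result lanes are 16 bits wide, so the sums cannot overflow. */
static void addl_u8x8(const uint8_t a[8], const uint8_t b[8], uint16_t out[8])
{
#if HAVE_NEON
    uint8x8_t va = vld1_u8(a);
    uint8x8_t vb = vld1_u8(b);
    vst1q_u16(out, vaddl_u8(va, vb));  /* might compile to VADDL.U8 */
#else
    int i;
    for (i = 0; i < 8; ++i)
        out[i] = (uint16_t)(a[i] + b[i]);
#endif
}
```

Note that vaddl_u8 returns a uint16x8_t, so the store uses vst1q_u16, the q form that operates on 128-bit vectors.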

Note

The intrinsic function prototypes in this topic use the following type annotations:

__const(n)

The argument n must be a compile-time constant.

__constrange(min, max)

The argument must be a compile-time constant in the range min to max.

__transfersize(n)

The intrinsic loads n lanes from this pointer.
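As a minimal sketch of what these annotations mean for calling code, the following example uses vld1_s16, whose pointer argument reads four lanes, and vget_lane_s16, whose lane argument must be a compile-time constant in the range 0 to 3. It assumes that arm_neon.h is available and that the compiler defines __ARM_NEON or __ARM_NEON__ when NEON is enabled. The helper name last_lane_s16x4 and the HAVE_NEON macro are hypothetical, and the scalar fallback is an addition for hosts without NEON.

```c
#include <stdint.h>

/* Assumption: one of these macros is defined when NEON is enabled. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Hypothetical helper: returns the last of four signed 16-bit lanes. */
static int16_t last_lane_s16x4(const int16_t src[4])
{
#if HAVE_NEON
    /* The load reads 4 lanes, so src must point to at least
       four valid int16_t values. */
    int16x4_t v = vld1_s16(src);

    /* The lane index is a literal constant in the range 0 to 3.
       A run-time variable is not accepted here. */
    return vget_lane_s16(v, 3);
#else
    return src[3];
#endif
}
```

Passing a buffer shorter than four elements, or a lane index held in a variable, violates the constraints that the annotations describe.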

Note

The NEON intrinsic function prototypes that use __fp16 are only available for targets that have the NEON half-precision VFP extension. To enable use of __fp16, use the --fp16_format command-line option. See --fp16_format=format.

The intrinsics are grouped into: