Program conventions

Macros

To use the intrinsics, the target must support the Advanced SIMD architecture. Even then, some specific instructions may or may not be enabled. When the following macros are defined and equal to 1, the corresponding features are available:

  • __ARM_NEON
    • Advanced SIMD is supported by the compiler
    • Always 1 for AArch64
  • __ARM_NEON_FP
    • Neon floating-point operations are supported
    • Always 1 for AArch64
  • __ARM_FEATURE_CRYPTO
    • Crypto instructions are available.
    • Cryptographic Neon intrinsics are therefore available.
  • __ARM_FEATURE_FMA
    • The fused multiply-accumulate instructions are available.
    • Neon intrinsics which use these are therefore available.

This list is not exhaustive and further macros are detailed in the Arm C Language Extensions document.
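As an illustration of how these macros might be used, the following is a minimal sketch that selects a Neon code path at compile time and falls back to plain C otherwise. The names HAVE_NEON, add_bytes and fma_scalar are hypothetical and chosen for this example only; the intrinsics themselves (vld1q_u8, vaddq_u8, vst1q_u8, vfmaq_f32, vdupq_n_f32, vgetq_lane_f32) come from arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* HAVE_NEON is a hypothetical convenience macro for this sketch. */
#if defined(__ARM_NEON) && (__ARM_NEON == 1)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* Adds two byte arrays element-wise; each sum wraps modulo 256. */
void add_bytes(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;
    assert(n == 0 || (dst && a && b));
#if HAVE_NEON
    /* Advanced SIMD path: 16 bytes per iteration. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vaddq_u8(va, vb));
    }
#endif
    /* Scalar tail, or the whole array when Neon is unavailable. */
    for (; i < n; ++i) {
        dst[i] = (uint8_t)(a[i] + b[i]);
    }
}

/* Computes a + b * c, using a fused multiply-accumulate intrinsic
 * only when __ARM_FEATURE_FMA reports that it is available. */
float fma_scalar(float a, float b, float c)
{
#if HAVE_NEON && defined(__ARM_FEATURE_FMA) && (__ARM_FEATURE_FMA == 1)
    float32x4_t r = vfmaq_f32(vdupq_n_f32(a), vdupq_n_f32(b), vdupq_n_f32(c));
    return vgetq_lane_f32(r, 0);
#else
    return a + b * c;
#endif
}
```

Guarding the include of arm_neon.h with the same test keeps the file compilable on targets without Advanced SIMD.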

Types

There are three major categories of data type available in arm_neon.h which follow these patterns:

baseW_t
Scalar data types
baseWxL_t
Vector data types
baseWxLxN_t
Vector array data types

Where:

  • base refers to the fundamental data type.
  • W is the width of the fundamental type.
  • L is the number of scalar data type instances in a vector data type, so a vector behaves like an array of L scalars.
  • N is the number of vector data type instances in a vector array type, so a vector array behaves like a struct holding an array of N vectors.

Generally, W and L are chosen so that a vector data type is 64 or 128 bits long and therefore fits completely into a Neon register. N corresponds to those instructions that operate on multiple registers at once.

In our earlier code we encountered an example of all three:

  • uint8_t
  • uint8x16_t
  • uint8x16x3_t
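To show the three categories side by side, here is a minimal sketch that de-interleaves 16 RGB pixels. The function name split_rgb16 and the scalar fallback are assumptions made for this example. On a Neon target, vld3q_u8 returns a uint8x16x3_t whose val[0], val[1] and val[2] members are each a uint8x16_t.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Splits 16 interleaved RGB pixels (48 bytes) into three 16-byte planes.
 * split_rgb16 is a hypothetical helper for this sketch. */
void split_rgb16(const uint8_t *rgb, uint8_t *r, uint8_t *g, uint8_t *b)
{
    assert(rgb && r && g && b);
#if defined(__ARM_NEON)
    /* Vector array type: three 128-bit vectors loaded at once. */
    uint8x16x3_t px = vld3q_u8(rgb);

    /* Vector type: each member of val[] is a uint8x16_t. */
    uint8x16_t red   = px.val[0];
    uint8x16_t green = px.val[1];
    uint8x16_t blue  = px.val[2];

    vst1q_u8(r, red);
    vst1q_u8(g, green);
    vst1q_u8(b, blue);
#else
    /* Scalar type only: one uint8_t at a time. */
    for (int i = 0; i < 16; ++i) {
        r[i] = rgb[3 * i];
        g[i] = rgb[3 * i + 1];
        b[i] = rgb[3 * i + 2];
    }
#endif
}
```

Here W is 8, L is 16 and N is 3, so each vector is 8 × 16 = 128 bits and the vector array spans three registers.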

Functions

As per the Arm C Language Extensions, the function prototypes from arm_neon.h follow a common pattern. At the most general level this is:

ret v[p][q][r]name[u][n][q][x][_high][_lane | laneq][_n][_result]_type(args)

Be aware that some of the letters and names are overloaded. In the order in which they appear above:

ret
the return type of the function.
v
short for vector and is present on all the intrinsics.
p
indicates a pairwise operation. (In the pattern, [value] means that value may be present.)
q
indicates a saturating operation (with the exception of vqtb[l][x] in AArch64 operations where the q indicates 128-bit index and result operands).
r
indicates a rounding operation.
name
the descriptive name of the basic operation. Often this is an Advanced SIMD instruction, but it does not have to be.
u
indicates signed-to-unsigned saturation.
n
indicates a narrowing operation.
q
postfixing the name indicates an operation on 128-bit vectors.
x
indicates an Advanced SIMD scalar operation in AArch64. It can be one of b, h, s or d (that is, 8, 16, 32, or 64 bits).
_high
In AArch64, used for widening and narrowing operations involving 128-bit operands. For widening 128-bit operands, high refers to the top 64 bits of the source operand(s). For narrowing, it refers to the top 64 bits of the destination operand.
_n
indicates a scalar operand supplied as an argument.
_lane
indicates a scalar operand taken from the lane of a vector. _laneq indicates a scalar operand taken from the lane of an input vector of 128-bit width. (In the pattern, left | right means that only one of left or right appears.)
type
the primary operand type in short form.
args
the function's arguments.
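To make the naming pattern more concrete, the following sketch uses three intrinsics and breaks each name into its parts in the comments. The wrapper names (sat_add_u8x16, narrow_s16x8_to_u8, scale_f32x4) and the scalar fallbacks are hypothetical and exist only for this example. The intrinsics vqaddq_u8, vqmovun_s16 and vmulq_n_f32 are declared in arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* vqaddq_u8 = v + q (saturating) + add (name) + q (128-bit) + _u8 (type) */
void sat_add_u8x16(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    assert(dst && a && b);
#if defined(__ARM_NEON)
    vst1q_u8(dst, vqaddq_u8(vld1q_u8(a), vld1q_u8(b)));
#else
    for (int i = 0; i < 16; ++i) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
#endif
}

/* vqmovun_s16 = v + q (saturating) + mov (name) + u (signed-to-unsigned)
 *               + n (narrowing) + _s16 (type of the source operand) */
void narrow_s16x8_to_u8(uint8_t *dst, const int16_t *src)
{
    assert(dst && src);
#if defined(__ARM_NEON)
    vst1_u8(dst, vqmovun_s16(vld1q_s16(src)));
#else
    for (int i = 0; i < 8; ++i) {
        int16_t v = src[i];
        dst[i] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }
#endif
}

/* vmulq_n_f32 = v + mul (name) + q (128-bit) + _n (scalar argument) + _f32 */
void scale_f32x4(float *dst, const float *src, float k)
{
    assert(dst && src);
#if defined(__ARM_NEON)
    vst1q_f32(dst, vmulq_n_f32(vld1q_f32(src), k));
#else
    for (int i = 0; i < 4; ++i) {
        dst[i] = src[i] * k;
    }
#endif
}
```

Note how q appears twice in vqaddq_u8 with two different meanings, which is the overloading the pattern warns about.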