7.2. NEON and Floating-Point architecture

The contents of the NEON registers are vectors of elements of the same data type. A vector is divided into lanes and each lane contains a data value called an element.

The number of lanes in a NEON vector depends on the size of the vector and the size of the data elements in it.
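
The relationship can be sketched in portable C as a simple division of the vector width by the element width. The function name `lane_count` is an illustrative assumption, not part of any ARM API.

```c
/* Illustrative helper (hypothetical name): number of lanes in a vector.
   Both widths are given in bits. Returns 0 when the element width is
   zero or does not divide the vector width evenly. */
unsigned lane_count(unsigned vector_bits, unsigned element_bits)
{
    if (element_bits == 0 || vector_bits % element_bits != 0)
        return 0;
    return vector_bits / element_bits;
}
```

Under this sketch, a 128-bit vector has sixteen 8-bit lanes, eight 16-bit lanes, four 32-bit lanes, or two 64-bit lanes, and a 64-bit vector has half as many of each.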

Usually, each NEON instruction performs n operations in parallel, where n is the number of lanes that the input vectors are divided into. Lanes are independent: a carry or overflow in one lane never propagates into another. Elements are numbered from the least significant end of the register, so element 0 occupies the least significant bits.
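
The lane behavior described above can be modeled in plain C without NEON hardware. The following is a minimal sketch that treats a `uint64_t` as a 64-bit vector of eight 8-bit lanes. The helper names `get_lane8` and `add_lanes8` are hypothetical and exist only for illustration. Real NEON code would use a single vector add instruction instead of a loop.

```c
#include <stdint.h>

/* Hypothetical helper: read 8-bit lane i (0..7) of a 64-bit vector.
   Lane 0 occupies the least significant bits. */
uint8_t get_lane8(uint64_t v, unsigned i)
{
    return (uint8_t)(v >> (8 * i));
}

/* Hypothetical helper: add two vectors lane by lane.
   Each 8-bit sum wraps modulo 256, so no carry reaches the next lane. */
uint64_t add_lanes8(uint64_t a, uint64_t b)
{
    uint64_t result = 0;
    for (unsigned i = 0; i < 8; i++) {
        uint8_t sum = (uint8_t)(get_lane8(a, i) + get_lane8(b, i));
        result |= (uint64_t)sum << (8 * i);
    }
    return result;
}
```

For example, `add_lanes8(0x01FF, 0x0001)` yields `0x0100`: lane 0 wraps from 0xFF to 0x00 and lane 1 is left at 0x01, whereas an ordinary 64-bit integer addition would carry into lane 1 and give `0x0200`.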

NEON and floating-point instructions operate on elements of the following types:

  • 32-bit single-precision and 64-bit double-precision floating-point.

    16-bit floating-point is supported, but only as a format to convert to or from. It is not supported for data processing operations.

  • 8-bit, 16-bit, 32-bit, or 64-bit unsigned and signed integers.

  • 8-bit and 16-bit polynomials.

    The polynomial type is for code, such as error correction, that uses power-of-two finite fields or simple polynomials over {0,1}. Normal ARM integer code typically uses a lookup table for finite field arithmetic. AArch64 NEON provides instructions to use large lookup tables.

    Polynomial operations are hard to synthesize out of other operations, so it is useful to have a basic multiply operation from which other, larger operations can be synthesized.
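
To make the polynomial type concrete, the following is a minimal portable C sketch of multiplying two 8-bit polynomials over {0,1}, where bit k of each operand is the coefficient of x^k. It is a software model only, and the function name `poly_mul8` is an assumption made for illustration. The partial products are combined with exclusive-OR instead of addition, so no carries occur between bit positions.

```c
#include <stdint.h>

/* Hypothetical helper: carry-less multiply of two 8-bit polynomials
   over {0,1}. The product has degree at most 14, so it fits in 16 bits. */
uint16_t poly_mul8(uint8_t a, uint8_t b)
{
    uint16_t result = 0;
    for (unsigned k = 0; k < 8; k++) {
        if (b & (1u << k)) {
            /* Add (XOR) a copy of a shifted by the power of x in b. */
            result ^= (uint16_t)((uint16_t)a << k);
        }
    }
    return result;
}
```

For example, `poly_mul8(3, 3)` returns 5, because (x + 1)(x + 1) = x^2 + 1 when coefficients are reduced modulo 2, whereas integer multiplication of the same bit patterns gives 9.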

The NEON unit views the register file as:

Thirty-two 128-bit V, or quadword, registers, V0-V31, each of which can be viewed as in Figure 7.1:

Figure 7.1. Divisions of the V register

Thirty-two 64-bit D, or doubleword, registers, D0-D31, each of which can be viewed as in Figure 7.2:

Figure 7.2. Divisions of the D register

All of these registers are accessible at any time. Software does not have to explicitly switch between them because the instruction used determines the appropriate view.
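
The idea that one register has several views can be modeled in portable C. This is a minimal sketch, assuming the AArch64 arrangement in which Dn names the low 64 bits of Vn. The type `vreg_t` and the functions `view_d` and `view_s_lane` are hypothetical names used only for illustration. They are not ARM intrinsics.

```c
#include <stdint.h>

/* Hypothetical model of one 128-bit V register as two 64-bit halves. */
typedef struct {
    uint64_t lo;  /* bits 63..0   */
    uint64_t hi;  /* bits 127..64 */
} vreg_t;

/* D view (assumed to be the low 64 bits of the same register). */
uint64_t view_d(vreg_t v)
{
    return v.lo;
}

/* 32-bit lane view: lane i in 0..3, lane 0 in the least significant bits. */
uint32_t view_s_lane(vreg_t v, unsigned i)
{
    uint64_t half = (i < 2) ? v.lo : v.hi;
    return (uint32_t)(half >> (32 * (i % 2)));
}
```

Both functions read the same underlying bits. Nothing is copied or switched between views, which mirrors how the instruction, and not a mode setting, selects the view in hardware.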