
Half-precision floating-point number format

Arm® Compiler supports the half-precision floating-point __fp16 type.

Half-precision is a floating-point format that occupies 16 bits. Architectures that support half-precision floating-point numbers include:

  • The Armv8 architecture.
  • The Armv7 FPv5 architecture.
  • The Armv7 VFPv4 architecture.
  • The Armv7 VFPv3 architecture (as an optional extension).

If the target hardware does not support half-precision floating-point numbers, the compiler uses the floating-point library fplib to provide software support for half-precision.

Note

The __fp16 type is a storage format only. For purposes of arithmetic and other operations, __fp16 values in C or C++ expressions are automatically promoted to float.
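As an illustration of the note above, the following minimal sketch stores samples in a 16-bit type and does the arithmetic in float. It assumes that the ACLE feature macro __ARM_FP16_FORMAT_IEEE is defined when the toolchain provides IEEE-format __fp16, and it falls back to float elsewhere so that it still compiles. The names sample_t and sum_samples are hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Assumption: __ARM_FP16_FORMAT_IEEE (an ACLE feature macro) is defined
   when the toolchain provides IEEE-format __fp16. Otherwise this sketch
   falls back to float, which is NOT half-precision, so that it compiles. */
#if defined(__ARM_FP16_FORMAT_IEEE)
typedef __fp16 sample_t;   /* 16-bit storage type */
#else
typedef float sample_t;    /* portability fallback only */
#endif

/* Hypothetical helper: sums an array of stored samples. */
float sum_samples(const sample_t *samples, size_t count)
{
    float total = 0.0f;          /* the accumulator is float, not __fp16 */
    for (size_t i = 0; i < count; ++i) {
        total += samples[i];     /* samples[i] is promoted to float here */
    }
    return total;
}
```

Keeping the accumulator in float matches the promotion rule: each stored value is widened before the addition, and no intermediate result is rounded back to 16 bits unless it is explicitly assigned to an __fp16 object.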

Half-precision floating-point format

Arm Compiler uses the half-precision binary floating-point format defined by IEEE 754r, the revision of the IEEE 754 standard that was published as IEEE 754-2008:

Figure 6-1 IEEE half-precision floating-point format


Where:

   S (bit[15]):      Sign bit
   E (bits[14:10]):  Biased exponent
   T (bits[9:0]):    Mantissa

The meanings of these fields are as follows:

IF E==31:
   IF T==0: Value = Signed infinity
   IF T!=0: Value = NaN
             T[9] determines Quiet or Signalling:
                  0: Signalling NaN
                  1: Quiet NaN
IF 0<E<31:
   Value = (-1)^S x 2^(E-15) x (1 + (2^(-10) x T))
IF E==0:
   IF T==0: Value = Signed zero
   IF T!=0: Value = (-1)^S x 2^(-14) x (0 + (2^(-10) x T))
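
The field definitions and value rules above can be sketched as a small portable decoder that works on the raw 16-bit pattern. This is an illustrative sketch, not the routine that fplib or the compiler uses. The function name half_bits_to_float is hypothetical, and the sketch returns a generic NaN without preserving the payload or distinguishing quiet from signalling NaNs.

```c
#include <math.h>     /* INFINITY, NAN */
#include <stdint.h>

/* Hypothetical helper: decodes a raw IEEE half-precision bit pattern. */
float half_bits_to_float(uint16_t bits)
{
    unsigned s = (bits >> 15) & 0x1u;     /* S, bit[15]     */
    unsigned e = (bits >> 10) & 0x1Fu;    /* E, bits[14:10] */
    unsigned t = bits & 0x3FFu;           /* T, bits[9:0]   */
    float sign = s ? -1.0f : 1.0f;

    if (e == 31) {
        /* T == 0: signed infinity. T != 0: NaN (payload not preserved). */
        return (t == 0) ? sign * INFINITY : NAN;
    }
    if (e == 0) {
        /* Signed zero or subnormal:
           (-1)^S x 2^(-14) x (T / 2^10) = (-1)^S x T / 2^24 */
        return sign * ((float)t / 16777216.0f);
    }
    /* Normal number:
       (-1)^S x 2^(E-15) x (1 + T / 2^10) = (-1)^S x (1024 + T) x 2^E / 2^25 */
    return sign * ((float)(1024u + t) * (float)(1u << e) / 33554432.0f);
}
```

For example, 0x3C00 has S=0, E=15, and T=0, so it decodes to 1.0. 0x7BFF has E=30 and T=1023, which gives the largest finite value, 65504. 0x0001 has E=0 and T=1, which gives the smallest positive subnormal, 2^(-24). Every step is exact in float because each half-precision value is representable in single precision.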

Note

See the Arm® C Language Extensions for more information.