You copied the Doc URL to your clipboard.


Floating-point down convert and narrow to BFloat16 (top, predicated).

Convert active 32-bit single-precision elements from the source vector to BFloat16 format, and place the results in the odd-numbered 16-bit elements of the destination vector, leaving the even-numbered elements unchanged. Inactive elements in the destination vector register remain unmodified.

Unlike the BFloat16 matrix multiplication and dot product instructions, this instruction honors all of the FPCR bits that apply to single-precision arithmetic. It can also generate a floating-point exception that causes cumulative exception bits in the FPSR to be set, or a synchronous exception to be taken, depending on the enable bits in the FPCR.

ID_AA64ZFR0_EL1.BF16 indicates whether this instruction is implemented.



BFCVTNT <Zd>.H, <Pg>/M, <Zn>.S

if !HaveSVE() || !HaveBF16Ext() then UNDEFINED;
integer g = UInt(Pg);
integer n = UInt(Zn);
integer d = UInt(Zd);

Assembler Symbols


Is the name of the destination scalable vector register, encoded in the "Zd" field.


Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.


Is the name of the first source scalable vector register, encoded in the "Zn" field.


integer elements = VL DIV 32;
bits(PL) mask = P[g];
bits(VL) operand  = Z[n];
bits(VL) result = Z[d];

for e = 0 to elements-1
    bits(32) element = Elem[operand, e, 32];
    if ElemP[mask, e, 32] == '1' then
        Elem[result, 2*e+1, 16] = FPConvertBF(element, FPCR);

Z[d] = result;