BFCVTNT

Floating-point down convert and narrow to BFloat16 (top, predicated).

Convert active 32-bit single-precision elements from the source vector to BFloat16 format, and place the results in the odd-numbered 16-bit elements of the destination vector, leaving the even-numbered elements unchanged. Inactive elements in the destination vector register remain unmodified.

Unlike the BFloat16 matrix multiplication and dot product instructions, this instruction honors all of the FPCR bits that apply to single-precision arithmetic. It can also generate a floating-point exception that causes cumulative exception bits in the FPSR to be set, or a synchronous exception to be taken, depending on the enable bits in the FPCR.

ID_AA64ZFR0_EL1.BF16 indicates whether this instruction is implemented.

Bits     31-13                 12-10   9-5   4-0
Field    0110010010001010101   Pg      Zn    Zd
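As an illustration of the encoding diagram, the following minimal Python sketch assembles the 32-bit instruction word from the three register fields. The function name `encode_bfcvtnt` and the constant `FIXED_BITS` are hypothetical helpers introduced here, not part of any Arm tool. The field positions are inferred from the field widths (19 fixed bits, a 3-bit Pg, 5-bit Zn and Zd).

```python
# Fixed opcode bits 31..13, copied from the encoding diagram.
FIXED_BITS = int("0110010010001010101", 2) << 13


def encode_bfcvtnt(zd, pg, zn):
    """Return the 32-bit word for BFCVTNT <Zd>.H, <Pg>/M, <Zn>.S.

    Hypothetical helper: field positions assumed to be
    Pg = bits 12..10, Zn = bits 9..5, Zd = bits 4..0.
    """
    if not 0 <= zd <= 31:
        raise ValueError("Zd must be in Z0-Z31")
    if not 0 <= zn <= 31:
        raise ValueError("Zn must be in Z0-Z31")
    if not 0 <= pg <= 7:
        raise ValueError("Pg must be in P0-P7 (3-bit field)")
    return FIXED_BITS | (pg << 10) | (zn << 5) | zd


if __name__ == "__main__":
    # BFCVTNT Z1.H, P2/M, Z3.S
    print(hex(encode_bfcvtnt(zd=1, pg=2, zn=3)))
```

The range check on `pg` reflects the restriction of the governing predicate to P0-P7.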

SVE

BFCVTNT <Zd>.H, <Pg>/M, <Zn>.S

if !HaveSVE() || !HaveBF16Ext() then UNDEFINED;
integer g = UInt(Pg);
integer n = UInt(Zn);
integer d = UInt(Zd);

Assembler Symbols

<Zd>

Is the name of the destination scalable vector register, encoded in the "Zd" field.

<Pg>

Is the name of the governing scalable predicate register P0-P7, encoded in the "Pg" field.

<Zn>

Is the name of the source scalable vector register, encoded in the "Zn" field.

Operation

CheckSVEEnabled();
integer elements = VL DIV 32;
bits(PL) mask = P[g];
bits(VL) operand = Z[n];
bits(VL) result = Z[d];

for e = 0 to elements-1
    bits(32) element = Elem[operand, e, 32];
    if ElemP[mask, e, 32] == '1' then
        Elem[result, 2*e+1, 16] = FPConvertBF(element, FPCR<31:0>);

Z[d] = result;
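The pseudocode above can be modeled in software as a rough sketch. The Python below treats the vectors as lists of integer lanes and is not a definitive model of FPConvertBF: it assumes round-to-nearest-even (the default FPCR rounding mode), assumes NaNs are quieted with their upper payload bits kept (FPCR.DN = 0), and does not model FPSR exception flags, trapping, or flush-to-zero. The names `f32_to_bf16_rne` and `bfcvtnt` are hypothetical.

```python
def f32_to_bf16_rne(bits32):
    """Narrow a raw 32-bit single-precision pattern to a BFloat16 pattern.

    Assumptions: round-to-nearest-even, NaNs quieted (not replaced by the
    default NaN), no exception flags recorded.
    """
    exp = (bits32 >> 23) & 0xFF
    frac = bits32 & 0x7FFFFF
    if exp == 0xFF and frac != 0:
        # NaN: keep the sign and top payload bits, force the quiet bit.
        return ((bits32 >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + bias) >> 16) & 0xFFFF


def bfcvtnt(zd_h, pg, zn_s):
    """Model BFCVTNT <Zd>.H, <Pg>/M, <Zn>.S on Python lists.

    zd_h: 16-bit lanes of Zd (two per 32-bit element)
    pg:   one boolean per 32-bit element
    zn_s: raw 32-bit single-precision lanes of Zn
    """
    if len(pg) != len(zn_s) or len(zd_h) != 2 * len(zn_s):
        raise ValueError("lane counts do not match")
    result = list(zd_h)
    for e, (active, element) in enumerate(zip(pg, zn_s)):
        if active:
            # Only the odd-numbered (top) 16-bit lane is written.
            result[2 * e + 1] = f32_to_bf16_rne(element)
    return result


if __name__ == "__main__":
    zd = [0x1111] * 8
    pg = [True, False, True, False]
    zn = [0x3F800000, 0x40000000, 0xC0000000, 0x40400000]  # 1.0, 2.0, -2.0, 3.0
    print([hex(h) for h in bfcvtnt(zd, pg, zn)])
```

In this model the even-numbered lanes and the odd lanes of inactive elements keep their previous contents, mirroring the merging behaviour described for the instruction.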