vbfdot_laneq_f32
SIMD ISA | Return Type | Name | Arguments | Instruction Group |
---|---|---|---|---|
Neon | float32x2_t | vbfdot_laneq_f32 | (float32x2_t r, bfloat16x4_t a, bfloat16x8_t b, const int lane) | Vector arithmetic / Dot product |
Description

BFloat16 floating-point dot product (vector, by element). This instruction delimits the source vectors into pairs of 16-bit BF16 elements. Each pair of elements in the first source vector is multiplied by the specified pair of elements in the second source vector. The resulting single-precision products are then summed and added destructively to the single-precision element of the destination vector that aligns with the pair of BF16 values in the first source vector.

Results

- `Vd.2S`: result

This intrinsic compiles to the following instructions:

- `BFDOT`

Argument Preparation

- `r`: register `Vd.2S`
- `a`: register `Vn.4H`
- `b`: register `Vm.8H`
- `lane`: minimum 0; maximum 3

Architectures

- A32, A64
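The description above can be sketched as a portable scalar reference model. This is only an illustration under stated assumptions, not Arm's operation pseudocode: the names `bf16_to_f32`, `f32_to_bf16_trunc`, and `bfdot_laneq_ref` are hypothetical, BF16 values are passed as raw `uint16_t` bit patterns, and the model uses ordinary `float` arithmetic, so it does not reproduce the instruction's exact rounding, denormal, or NaN behavior.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen a raw BF16 bit pattern to float.
   BF16 occupies the upper 16 bits of an IEEE 754 binary32 value. */
static float bf16_to_f32(uint16_t bits) {
    uint32_t wide = (uint32_t)bits << 16;
    float f;
    memcpy(&f, &wide, sizeof f);
    return f;
}

/* Hypothetical helper: build a BF16 bit pattern by truncating a float.
   Only meant for preparing exactly representable example inputs. */
static uint16_t f32_to_bf16_trunc(float f) {
    uint32_t wide;
    memcpy(&wide, &f, sizeof wide);
    return (uint16_t)(wide >> 16);
}

/* Scalar model of the by-element dot product (assumption: plain float math).
   r    : 2 single-precision accumulators (Vd.2S)
   a    : 4 BF16 elements, i.e. 2 pairs   (Vn.4H)
   b    : 8 BF16 elements, i.e. 4 pairs   (Vm.8H)
   lane : selects which pair of b is used, 0..3 */
static void bfdot_laneq_ref(float out[2], const float r[2],
                            const uint16_t a[4], const uint16_t b[8],
                            int lane) {
    assert(lane >= 0 && lane <= 3);

    /* The same pair from b is applied to every pair in a. */
    float b0 = bf16_to_f32(b[2 * lane]);
    float b1 = bf16_to_f32(b[2 * lane + 1]);

    for (int i = 0; i < 2; ++i) {
        float a0 = bf16_to_f32(a[2 * i]);
        float a1 = bf16_to_f32(a[2 * i + 1]);
        /* Sum the two products, then accumulate into the aligned element. */
        out[i] = r[i] + (a0 * b0 + a1 * b1);
    }
}
```

On hardware with the BF16 extension, the equivalent call would be `vbfdot_laneq_f32(r, a, b, lane)` from `<arm_neon.h>`, with `lane` given as a compile-time constant in the range 0 to 3.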
Copyright © 1995-2025 Arm Limited (or its affiliates). All rights reserved.