VCMLA (by element)

Vector Complex Multiply Accumulate (by element).

This instruction multiplies the complex numbers in the first source vector register by the specified complex number in the second source vector register, and adds the results to the corresponding complex numbers in the destination vector register. The number of complex numbers that can be stored in the source and the destination vector registers is calculated as the vector register size divided by the length of each complex number. Each complex number is represented in a SIMD&FP register as a pair of elements with the imaginary part of the number being placed in the more significant element, and the real part of the number being placed in the less significant element. Both real and imaginary parts of the source and the resulting complex number are represented as floating-point values.
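The capacity and layout rule above can be illustrated with a short sketch. The helper name `complex_layout` is hypothetical and only restates the paragraph: each complex number occupies two adjacent elements, with the real part in the less significant one.

```python
def complex_layout(reg_bits: int, esize: int) -> list:
    """Return (real_index, imag_index) element positions for each complex number.

    reg_bits is the vector register size (64 or 128) and esize is the size of
    one floating-point element (16 or 32). Hypothetical helper for illustration.
    """
    if reg_bits not in (64, 128) or esize not in (16, 32):
        raise ValueError("unsupported register or element size")
    # Each complex number is a pair of elements, so its length is 2 * esize.
    count = reg_bits // (2 * esize)
    # Real part in the less significant (even) element, imaginary part in the
    # more significant (odd) element of the pair.
    return [(2 * i, 2 * i + 1) for i in range(count)]
```

For example, a 128-bit register holds four half-precision complex numbers, and a 64-bit register holds one single-precision complex number.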

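None, one, or both of the two vector elements that are read from each of the numbers in the second source SIMD&FP register can be negated based on the rotation value. As given by the Operation pseudocode:

For a rotation of 0, neither element is negated.
For a rotation of 90, the imaginary element is negated.
For a rotation of 180, both elements are negated.
For a rotation of 270, the real element is negated.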
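One way to read the rotation-dependent negation is as a rotation of the second-source complex number on an Argand diagram. The sketch below is an interpretation drawn from the Operation pseudocode, not text from this page. It assumes `rot` is the 2-bit field value (0 to 3), and `rotate_second_source` is a hypothetical name.

```python
def rotate_second_source(re: float, im: float, rot: int) -> tuple:
    """Return the two values taken from the second-source complex number.

    The first value feeds the real result element, the second feeds the
    imaginary result element (element1 and element3 in the pseudocode).
    """
    table = {
        0: (re, im),    # rotation 0: neither element negated
        1: (-im, re),   # rotation 90: imaginary element negated
        2: (-re, -im),  # rotation 180: both elements negated
        3: (im, -re),   # rotation 270: real element negated
    }
    return table[rot]
```

In this model, the returned pair equals the original complex number multiplied by i raised to the power `rot`, which is consistent with the operand being named a rotation.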

The indexed element variant of this instruction is available for half-precision and single-precision values. For this variant, the index selects the single complex number in the second source vector register that is multiplied with each of the complex numbers in the first source vector register. For single-precision values, the index is always 0.

Depending on settings in the CPACR, NSACR, and HCPTR registers, and the security state and mode in which the instruction is executed, an attempt to execute the instruction might be UNDEFINED, or trapped to Hyp mode. For more information, see Enabling Advanced SIMD and floating-point support.

It has encodings from the following instruction sets: A32 (A1) and T32 (T1).

A1
(ARMv8.3)

Bits    31:24     23  22  21:20  19:16  15:12  11:8  7  6  5  4  3:0
Field   11111110  S   D   rot    Vn     Vd     1000  N  Q  M  0  Vm
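Reading the A1 diagram from bit 31 down to bit 0 gives these field positions: fixed bits 11111110 in [31:24], S in [23], D in [22], rot in [21:20], Vn in [19:16], Vd in [15:12], fixed bits 1000 in [11:8], N in [7], Q in [6], M in [5], a fixed 0 in [4], and Vm in [3:0]. A minimal sketch of a field extractor follows. The function name and the returned dictionary are hypothetical and not part of any Arm tooling.

```python
def decode_vcmla_a1(word: int) -> dict:
    """Split a 32-bit A1 encoding of VCMLA (by element) into named fields."""

    def bits(hi: int, lo: int) -> int:
        # Extract word[hi:lo] as an unsigned integer.
        return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

    # Check the fixed bits shown in the diagram.
    if bits(31, 24) != 0b11111110 or bits(11, 8) != 0b1000 or bits(4, 4) != 0:
        raise ValueError("not a VCMLA (by element) encoding")
    return {
        "S": bits(23, 23), "D": bits(22, 22), "rot": bits(21, 20),
        "Vn": bits(19, 16), "Vd": bits(15, 12),
        "N": bits(7, 7), "Q": bits(6, 6), "M": bits(5, 5), "Vm": bits(3, 0),
    }
```

This checks only the fixed bits in the diagram. The UNDEFINED and UNPREDICTABLE conditions in the decode pseudocode are not modeled here.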

64-bit SIMD vector of half-precision floating-point (S == 0 && Q == 0)

VCMLA{<q>}.F16 <Dd>, <Dn>, <Dm>[<index>], #<rotate>

64-bit SIMD vector of single-precision floating-point (S == 1 && Q == 0)

VCMLA{<q>}.F32 <Dd>, <Dn>, <Dm>[0], #<rotate>

128-bit SIMD vector of half-precision floating-point (S == 0 && Q == 1)

VCMLA{<q>}.F16 <Qd>, <Qn>, <Dm>[<index>], #<rotate>

128-bit SIMD vector of single-precision floating-point (S == 1 && Q == 1)

VCMLA{<q>}.F32 <Qd>, <Qn>, <Dm>[0], #<rotate>

if !HaveFJCVTZSExt() then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
d = UInt(D:Vd);
n = UInt(N:Vn);
m = if S=='1' then UInt(M:Vm) else UInt(Vm);
esize = 16 << UInt(S);
if !HaveFP16Ext() && esize == 16 then UNDEFINED;
elements = 64 DIV esize;
regs = if Q=='0' then 1 else 2;
index = if S=='1' then 0 else UInt(M);
if InITBlock() then UNPREDICTABLE;
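The decode pseudocode above can be mirrored in a few lines. The sketch below assumes the fields arrive as a dictionary of integers, which is a hypothetical representation. It omits the first feature check and the IT block check, and the `have_fp16` parameter stands in for HaveFP16Ext().

```python
def derive_operands(f: dict, have_fp16: bool = True) -> dict:
    """Compute d, n, m, esize, elements, regs and index from encoding fields."""
    s, q = f["S"], f["Q"]
    # A 128-bit operation needs even D-register numbers for Vd and Vn.
    if q == 1 and ((f["Vd"] & 1) or (f["Vn"] & 1)):
        raise ValueError("UNDEFINED: odd Vd or Vn with Q == 1")
    esize = 16 << s
    if esize == 16 and not have_fp16:
        raise ValueError("UNDEFINED: half-precision not implemented")
    return {
        "d": (f["D"] << 4) | f["Vd"],                              # UInt(D:Vd)
        "n": (f["N"] << 4) | f["Vn"],                              # UInt(N:Vn)
        "m": ((f["M"] << 4) | f["Vm"]) if s == 1 else f["Vm"],
        "esize": esize,
        "elements": 64 // esize,
        "regs": 1 if q == 0 else 2,
        "index": 0 if s == 1 else f["M"],
    }
```

The M bit does double duty. For half-precision it is the element index, so only D0 to D15 can be named as the second source. For single-precision it extends the register number and the index is fixed at 0.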

T1
(ARMv8.3)

First halfword:
Bits    15:8      7   6   5:4  3:0
Field   11111110  S   D   rot  Vn

Second halfword:
Bits    15:12  11:8  7  6  5  4  3:0
Field   Vd     1000  N  Q  M  0  Vm
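The T1 diagram numbers its bits 15 to 0 twice, once per halfword, and carries the same field pattern as A1. A minimal sketch, assuming the first halfword supplies the upper 16 bits of an A1-style 32-bit value so that the same field positions can be reused. The helper name is hypothetical.

```python
def t1_to_word(hw1: int, hw2: int) -> int:
    """Combine the two T1 halfwords into one 32-bit value laid out like A1."""
    if not (0 <= hw1 <= 0xFFFF and 0 <= hw2 <= 0xFFFF):
        raise ValueError("halfwords must be 16-bit values")
    # First halfword: 11111110 S D rot Vn; second: Vd 1000 N Q M 0 Vm.
    return (hw1 << 16) | hw2
```

This describes only the logical field layout. It makes no claim about how the halfwords are ordered in memory.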

64-bit SIMD vector of half-precision floating-point (S == 0 && Q == 0)

VCMLA{<q>}.F16 <Dd>, <Dn>, <Dm>[<index>], #<rotate>

64-bit SIMD vector of single-precision floating-point (S == 1 && Q == 0)

VCMLA{<q>}.F32 <Dd>, <Dn>, <Dm>[0], #<rotate>

128-bit SIMD vector of half-precision floating-point (S == 0 && Q == 1)

VCMLA{<q>}.F16 <Qd>, <Qn>, <Dm>[<index>], #<rotate>

128-bit SIMD vector of single-precision floating-point (S == 1 && Q == 1)

VCMLA{<q>}.F32 <Qd>, <Qn>, <Dm>[0], #<rotate>

if !HaveFJCVTZSExt() then UNDEFINED;
if Q == '1' && (Vd<0> == '1' || Vn<0> == '1') then UNDEFINED;
d = UInt(D:Vd);
n = UInt(N:Vn);
m = if S=='1' then UInt(M:Vm) else UInt(Vm);
esize = 16 << UInt(S);
if !HaveFP16Ext() && esize == 16 then UNDEFINED;
elements = 64 DIV esize;
regs = if Q=='0' then 1 else 2;
index = if S=='1' then 0 else UInt(M);
if InITBlock() then UNPREDICTABLE;

Assembler Symbols

<q>

See Standard assembler syntax fields.

<Qd>

Is the 128-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field as <Qd>*2.

<Qn>

Is the 128-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field as <Qn>*2.

<Dd>

Is the 64-bit name of the SIMD&FP destination register, encoded in the "D:Vd" field.

<Dn>

Is the 64-bit name of the first SIMD&FP source register, encoded in the "N:Vn" field.

<Dm>

For the half-precision scalar variant: is the 64-bit name of the second SIMD&FP source register, encoded in the "Vm" field.

For the single-precision scalar variant: is the 64-bit name of the second SIMD&FP source register, encoded in the "M:Vm" field.

<index>

Is the element index in the range 0 to 1, encoded in the "M" field.

<rotate>

Is the rotation to be applied to elements in the second SIMD&FP source register, encoded in the "rot" field:

rot   <rotate>
00    0
01    90
10    180
11    270
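The symbol encodings above can be sketched from the assembler side. Both helper names are hypothetical. The Q register helper assumes register numbers Q0 to Q15 and returns the two parts of a "D:Vd" or "N:Vn" field.

```python
def encode_q_register(q: int) -> tuple:
    """Return (top_bit, low4) for Q register number q, encoded as q * 2."""
    if not 0 <= q <= 15:
        raise ValueError("Q register number out of range")
    value = q * 2  # doubling keeps bit 0 of Vd/Vn clear, as decode requires
    return value >> 4, value & 0xF


def encode_rotate(rotate: int) -> int:
    """Map a <rotate> value in degrees to the 2-bit rot field."""
    if rotate not in (0, 90, 180, 270):
        raise ValueError("rotate must be 0, 90, 180 or 270")
    return rotate // 90
```

Because a Q register is encoded as twice its number, the low bit of Vd and Vn is always 0 for the 128-bit variants. This matches the UNDEFINED check on Vd<0> and Vn<0> in the decode pseudocode.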

Operation

EncodingSpecificOperations();
CheckAdvSIMDEnabled();
for r = 0 to regs-1
    operand2 = D[n+r];
    operand1 = D[m];
    operand3 = D[d+r];
    for e = 0 to (elements DIV 2)-1
        case rot of
            when '00'
                element1 = Elem[operand1,index*2,esize];
                element2 = Elem[operand2,e*2,esize];
                element3 = Elem[operand1,index*2+1,esize];
                element4 = Elem[operand2,e*2,esize];
            when '01'
                element1 = FPNeg(Elem[operand1,index*2+1,esize]);
                element2 = Elem[operand2,e*2+1,esize];
                element3 = Elem[operand1,index*2,esize];
                element4 = Elem[operand2,e*2+1,esize];
            when '10'
                element1 = FPNeg(Elem[operand1,index*2,esize]);
                element2 = Elem[operand2,e*2,esize];
                element3 = FPNeg(Elem[operand1,index*2+1,esize]);
                element4 = Elem[operand2,e*2,esize];
            when '11'
                element1 = Elem[operand1,index*2+1,esize];
                element2 = Elem[operand2,e*2+1,esize];
                element3 = FPNeg(Elem[operand1,index*2,esize]);
                element4 = Elem[operand2,e*2+1,esize];
        result1 = FPMulAdd(Elem[operand3,e*2,esize],element2,element1,StandardFPSCRValue());
        result2 = FPMulAdd(Elem[operand3,e*2+1,esize],element4,element3,StandardFPSCRValue());
        Elem[D[d+r],e*2,esize] = result1;
        Elem[D[d+r],e*2+1,esize] = result2;
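A rough software model of the operation above may help when checking expected results. This is a sketch under stated assumptions, not a bit-accurate reference. It models one D register as a list of Python floats with element 0 least significant. It uses a separately rounded multiply and add in double precision instead of FPMulAdd with StandardFPSCRValue(). The function name is hypothetical. For a 128-bit operation, the function would be applied to each of the two destination and first-source D registers with the same second source.

```python
def vcmla_by_element(dest: list, src_n: list, src_m: list, index: int, rot: int) -> list:
    """Model VCMLA (by element) on one D register; returns the new destination."""
    out = list(dest)
    # The indexed complex number from the second source is reused for every pair.
    m_re, m_im = src_m[2 * index], src_m[2 * index + 1]
    for e in range(len(dest) // 2):
        n_re, n_im = src_n[2 * e], src_n[2 * e + 1]
        if rot == 0:
            e1, e2, e3, e4 = m_re, n_re, m_im, n_re
        elif rot == 1:
            e1, e2, e3, e4 = -m_im, n_im, m_re, n_im
        elif rot == 2:
            e1, e2, e3, e4 = -m_re, n_re, -m_im, n_re
        elif rot == 3:
            e1, e2, e3, e4 = m_im, n_im, -m_re, n_im
        else:
            raise ValueError("rot must be 0..3")
        out[2 * e] = dest[2 * e] + e2 * e1          # real part of the result
        out[2 * e + 1] = dest[2 * e + 1] + e4 * e3  # imaginary part of the result
    return out
```

In this model, issuing the operation with rot 0 and then with rot 1 (rotations 0 and 90) on the same operands accumulates the full complex product of each first-source number with the selected second-source number.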


Internal version only: isa v00_76, pseudocode v33.1; Build timestamp: 2017-09-26T15:10

Copyright © 2010-2017 ARM Limited or its affiliates. All rights reserved. This document is Non-Confidential.