SQDMLAL, SQDMLAL2 (by element)
Signed saturating Doubling Multiply-Add Long (by element). This instruction multiplies each vector element in the lower or upper half of the first source SIMD&FP register by the specified vector element of the second source SIMD&FP register, doubles the results, and accumulates the final results with the vector elements of the destination SIMD&FP register. The destination vector elements are twice as long as the elements that are multiplied.
If overflow occurs with any of the results, those results are saturated. If saturation occurs, the cumulative saturation bit FPSR.QC is set.
The SQDMLAL instruction extracts vector elements from the lower half of the first source register, while the SQDMLAL2 instruction extracts vector elements from the upper half of the first source register.
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state and Exception level, an attempt to execute the instruction might be trapped.
It has encodings from 2 classes:
Scalar
and
Vector
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
0 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | size | L | M | Rm | 0 | 0 | 1 | 1 | H | 0 | Rn | Rd |
| | | | | | | | o2 | | | | | |
integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
when '01' index = UInt(H:L:M); Rmhi = '0';
when '10' index = UInt(H:L); Rmhi = M;
otherwise UnallocatedEncoding();
integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);
integer esize = 8 << UInt(size);
integer datasize = esize;
integer elements = 1;
integer part = 0;
boolean sub_op = (o2 == '1');
31 | 30 | 29 | 28 | 27 | 26 | 25 | 24 | 23 | 22 | 21 | 20 | 19 | 18 | 17 | 16 | 15 | 14 | 13 | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
0 | Q | 0 | 0 | 1 | 1 | 1 | 1 | size | L | M | Rm | 0 | 0 | 1 | 1 | H | 0 | Rn | Rd |
| | | | | | | | | o2 | | | | | |
integer idxdsize = if H == '1' then 128 else 64;
integer index;
bit Rmhi;
case size of
when '01' index = UInt(H:L:M); Rmhi = '0';
when '10' index = UInt(H:L); Rmhi = M;
otherwise UnallocatedEncoding();
integer d = UInt(Rd);
integer n = UInt(Rn);
integer m = UInt(Rmhi:Rm);
integer esize = 8 << UInt(size);
integer datasize = 64;
integer part = UInt(Q);
integer elements = datasize DIV esize;
boolean sub_op = (o2 == '1');
Assembler Symbols
Operation
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = Vpart[n, part];
bits(idxdsize) operand2 = V[m];
bits(2*datasize) operand3 = V[d];
bits(2*datasize) result;
integer element1;
integer element2;
bits(2*esize) product;
integer accum;
boolean sat1;
boolean sat2;
element2 = SInt(Elem[operand2, index, esize]);
for e = 0 to elements-1
element1 = SInt(Elem[operand1, e, esize]);
(product, sat1) = SignedSatQ(2 * element1 * element2, 2*esize);
if sub_op then
accum = SInt(Elem[operand3, e, 2*esize]) - SInt(product);
else
accum = SInt(Elem[operand3, e, 2*esize]) + SInt(product);
(Elem[result, e, 2*esize], sat2) = SignedSatQ(accum, 2*esize);
if sat1 || sat2 then FPSR.QC = '1';
V[d] = result;
Internal version only: isa v25.07, AdvSIMD v23.0, pseudocode v31.3
Copyright © 2010-2017 ARM Limited or its affiliates. All rights reserved.
This document is Non-Confidential.