LD2R
Load single 2-element structure and Replicate to all lanes of two registers. This instruction loads a 2-element structure from memory and replicates the structure to all the lanes of the two SIMD&FP registers.
Depending on the settings in the CPACR_EL1, CPTR_EL2, and CPTR_EL3 registers, and the current Security state and Exception level, an attempt to execute the instruction might be trapped.
It has encodings from 2 classes: No offset and Post-index
No offset
Bits | 31 | 30 | 29-24 | 23 | 22 | 21 | 20-16 | 15-13 | 12 | 11-10 | 9-5 | 4-0 |
Value | 0 | Q | 001101 | 0 | 1 | 1 | 00000 | 110 | 0 | size | Rn | Rt |
Field | | | | | L | R | | opcode | S | | | |
integer t = UInt(Rt);
integer n = UInt(Rn);
integer m = integer UNKNOWN;
boolean wback = FALSE;
boolean tag_checked = wback || n != 31;
Post-index
Bits | 31 | 30 | 29-24 | 23 | 22 | 21 | 20-16 | 15-13 | 12 | 11-10 | 9-5 | 4-0 |
Value | 0 | Q | 001101 | 1 | 1 | 1 | Rm | 110 | 0 | size | Rn | Rt |
Field | | | | | L | R | | opcode | S | | | |
integer t = UInt(Rt);
integer n = UInt(Rn);
integer m = UInt(Rm);
boolean wback = TRUE;
boolean tag_checked = wback || n != 31;
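The two encoding classes differ only in bit 23 and the Rm field, which can be illustrated by assembling the instruction word from its fields. The following is a minimal sketch, not an assembler: the function name `encode_ld2r` and its parameters are hypothetical, and passing `rm=31` is assumed to select the immediate post-index form, since the Operation pseudocode only reads X[m] when m != 31.

```python
def encode_ld2r(rt, rn, size, q, rm=None):
    """Build a 32-bit LD2R instruction word (hypothetical helper).

    rm=None selects the No offset class; any other value selects Post-index.
    """
    fields = (("rt", rt, 5), ("rn", rn, 5), ("size", size, 2), ("q", q, 1))
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")

    word = 0b001101 << 24   # fixed bits 29-24
    word |= q << 30         # Q: 64-bit or 128-bit registers
    word |= 1 << 22         # L = 1 (load)
    word |= 1 << 21         # R = 1
    word |= 0b110 << 13     # opcode = 110; S (bit 12) stays 0
    word |= size << 10      # element size
    word |= rn << 5         # base register
    word |= rt              # first destination register

    if rm is not None:
        if not 0 <= rm < 32:
            raise ValueError("rm does not fit in 5 bits")
        word |= 1 << 23     # bit 23 distinguishes Post-index
        word |= rm << 16
    return word


if __name__ == "__main__":
    print(f"{encode_ld2r(rt=0, rn=0, size=0, q=0):#010x}")         # 0x0d60c000
    print(f"{encode_ld2r(rt=0, rn=0, size=2, q=1, rm=31):#010x}")  # 0x4dffc800
```

In the No offset class bits 20-16 are left as zero, matching the fixed 00000 in the encoding diagram.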
Assembler Symbols
<Vt>
    Is the name of the first or only SIMD&FP register to be transferred, encoded in the "Rt" field.
<T>
    Is an arrangement specifier, encoded in size:Q:
    size | Q | <T> |
    00 | 0 | 8B |
    00 | 1 | 16B |
    01 | 0 | 4H |
    01 | 1 | 8H |
    10 | 0 | 2S |
    10 | 1 | 4S |
    11 | 0 | 1D |
    11 | 1 | 2D |
<Vt2>
    Is the name of the second SIMD&FP register to be transferred, encoded as "Rt" plus 1 modulo 32.
<Xn|SP>
    Is the 64-bit name of the general-purpose base register or stack pointer, encoded in the "Rn" field.
<imm>
    Is the post-index immediate offset, encoded in size:
    size | <imm> |
    00 | #2 |
    01 | #4 |
    10 | #8 |
    11 | #16 |
<Xm>
    Is the 64-bit name of the general-purpose post-index register, excluding XZR, encoded in the "Rm" field.
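Both lookup tables above follow from the element and register widths, so they can be derived rather than stored. The sketch below assumes the relations esize = 8 << size and datasize = 64 or 128 used by the decode pseudocode; the helper names `arrangement` and `post_index_imm` are hypothetical.

```python
def arrangement(size, q):
    """Return the <T> specifier for a size:Q pair, for example (2, 1) -> '4S'."""
    if size not in range(4) or q not in (0, 1):
        raise ValueError("size must be 0-3 and q must be 0 or 1")
    esize = 8 << size              # element width in bits
    datasize = 128 if q else 64    # register width in bits
    return f"{datasize // esize}{'BHSD'[size]}"


def post_index_imm(size):
    """Return <imm> in bytes: two elements of (8 << size) bits each."""
    if size not in range(4):
        raise ValueError("size must be 0-3")
    return 2 * ((8 << size) // 8)


if __name__ == "__main__":
    for size in range(4):
        print(size, arrangement(size, 0), arrangement(size, 1), post_index_imm(size))
```

The immediate is always the total number of bytes the instruction reads, which is why it doubles with each step in size.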
Shared Decode
integer scale = UInt(opcode<2:1>);
integer selem = UInt(opcode<0>:R) + 1;
boolean replicate = FALSE;
integer index;
case scale of
    when 3
        // load and replicate
        if L == '0' || S == '1' then UNDEFINED;
        scale = UInt(size);
        replicate = TRUE;
    when 0
        index = UInt(Q:S:size); // B[0-15]
    when 1
        if size<0> == '1' then UNDEFINED;
        index = UInt(Q:S:size<1>); // H[0-7]
    when 2
        if size<1> == '1' then UNDEFINED;
        if size<0> == '0' then
            index = UInt(Q:S); // S[0-3]
        else
            if S == '1' then UNDEFINED;
            index = UInt(Q); // D[0-1]
            scale = 3;
MemOp memop = if L == '1' then MemOp_LOAD else MemOp_STORE;
integer datasize = if Q == '1' then 128 else 64;
integer esize = 8 << scale;
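The shared decode can be traced with a small model that takes the instruction fields as plain integers. This is an illustrative transcription under that assumption, not Arm's reference implementation: the function `shared_decode`, the exception `Undefined`, and the returned dictionary layout are all hypothetical names.

```python
class Undefined(Exception):
    """Stands in for the UNDEFINED outcome of the pseudocode."""


def shared_decode(opcode, R, L, S, size, Q):
    scale = (opcode >> 1) & 0b11              # opcode<2:1>
    selem = (((opcode & 1) << 1) | R) + 1     # UInt(opcode<0>:R) + 1
    replicate = False
    index = None                              # unused on the replicate path

    if scale == 3:
        # load and replicate
        if L == 0 or S == 1:
            raise Undefined()
        scale = size
        replicate = True
    elif scale == 0:
        index = (Q << 3) | (S << 2) | size    # B[0-15]
    elif scale == 1:
        if size & 0b01:
            raise Undefined()
        index = (Q << 2) | (S << 1) | (size >> 1)  # H[0-7]
    else:  # scale == 2
        if size & 0b10:
            raise Undefined()
        if (size & 0b01) == 0:
            index = (Q << 1) | S              # S[0-3]
        else:
            if S == 1:
                raise Undefined()
            index = Q                         # D[0-1]
            scale = 3

    return {
        "selem": selem,
        "replicate": replicate,
        "index": index,
        "is_load": L == 1,
        "datasize": 128 if Q else 64,
        "esize": 8 << scale,
    }


if __name__ == "__main__":
    # LD2R fixes opcode = 110, R = 1, L = 1, S = 0.
    print(shared_decode(opcode=0b110, R=1, L=1, S=0, size=2, Q=1))
```

With the LD2R field values, the model takes the `scale == 3` branch, so `selem` is 2, `replicate` is true, and the element width comes from `size` alone.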
Operation
if HaveMTEExt() then
    SetNotTagCheckedInstruction(!tag_checked);
CheckFPAdvSIMDEnabled64();
bits(64) address;
bits(64) offs;
bits(128) rval;
bits(esize) element;
constant integer ebytes = esize DIV 8;
if n == 31 then
    CheckSPAlignment();
    address = SP[];
else
    address = X[n];
offs = Zeros();
if replicate then
    // load and replicate to all elements
    for s = 0 to selem-1
        element = Mem[address+offs, ebytes, AccType_VEC];
        // replicate to fill 128- or 64-bit register
        V[t] = Replicate(element, datasize DIV esize);
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
else
    // load/store one element per register
    for s = 0 to selem-1
        rval = V[t];
        if memop == MemOp_LOAD then
            // insert into one lane of 128-bit register
            Elem[rval, index, esize] = Mem[address+offs, ebytes, AccType_VEC];
            V[t] = rval;
        else // memop == MemOp_STORE
            // extract from one lane of 128-bit register
            Mem[address+offs, ebytes, AccType_VEC] = Elem[rval, index, esize];
        offs = offs + ebytes;
        t = (t + 1) MOD 32;
if wback then
    if m != 31 then
        offs = X[m];
    if n == 31 then
        SP[] = address + offs;
    else
        X[n] = address + offs;
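For LD2R only the replicate path and the writeback step are reached. The sketch below models that path on a flat byte buffer, assuming registers are represented as little-endian byte strings and ignoring traps, tag checking, and alignment checks; `ld2r_model` and its parameters are hypothetical names.

```python
MASK64 = (1 << 64) - 1


def ld2r_model(mem, address, esize, datasize, wback=False, xm=None):
    """Return (Vt, Vt2, new_base) for the replicate path.

    mem      -- bytes-like object standing in for memory
    xm       -- value of X[m] for the register post-index form, or None
                for the immediate form (m == 31 in the pseudocode)
    """
    ebytes = esize // 8
    offs = 0
    regs = []
    for _ in range(2):                            # selem = 2 for LD2R
        start = address + offs
        element = bytes(mem[start:start + ebytes])
        if len(element) != ebytes:
            raise IndexError("access outside the modelled memory")
        # Replicate the element to fill the 64- or 128-bit register.
        regs.append(element * (datasize // esize))
        offs += ebytes
    if wback:
        if xm is not None:
            offs = xm                             # register offset replaces the byte count
        address = (address + offs) & MASK64       # base update wraps at 64 bits
    return regs[0], regs[1], address


if __name__ == "__main__":
    memory = bytes(range(0x10, 0x20))
    vt, vt2, base = ld2r_model(memory, 4, esize=16, datasize=64, wback=True)
    print(vt.hex(), vt2.hex(), base)
```

In the immediate form the writeback amount is simply the number of bytes read, which is how the `<imm>` values of 2, 4, 8, and 16 arise.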
Operational information
If PSTATE.DIT is 1, the timing of this instruction is insensitive to the value of the data being loaded or stored.