ARM Compiler armasm User Guide : A64 SIMD Vector instructions in alphabetical order

SIMD Vector instructions in alphabetical order

A summary of the A64 SIMD Vector instructions that are supported.

Table 20-1 Summary of A64 SIMD Vector instructions

Mnemonic Brief description See
ABS (vector) Absolute value ABS (vector)
ADD (vector) Add ADD (vector)
ADDHN, ADDHN2 (vector) Add returning high narrow ADDHN, ADDHN2 (vector)
ADDP (vector) Add pairwise ADDP (vector)
ADDV (vector) Add across vector ADDV (vector)
AND (vector) Bitwise AND AND (vector)
BIC (vector, immediate) Bitwise bit clear (immediate) BIC (vector, immediate)
BIC (vector, register) Bitwise bit clear (register) BIC (vector, register)
BIF (vector) Bitwise insert if false BIF (vector)
BIT (vector) Bitwise insert if true BIT (vector)
BSL (vector) Bitwise select BSL (vector)
CLS (vector) Count leading sign bits CLS (vector)
CLZ (vector) Count leading zero bits CLZ (vector)
CMEQ (vector, register) Compare bitwise equal, setting destination vector element to all ones if the condition holds, else zero CMEQ (vector, register)
CMEQ (vector, zero) Compare bitwise equal to zero, setting destination vector element to all ones if the condition holds, else zero CMEQ (vector, zero)
CMGE (vector, register) Compare signed greater than or equal, setting destination vector element to all ones if the condition holds, else zero CMGE (vector, register)
CMGE (vector, zero) Compare signed greater than or equal to zero, setting destination vector element to all ones if the condition holds, else zero CMGE (vector, zero)
CMGT (vector, register) Compare signed greater than, setting destination vector element to all ones if the condition holds, else zero CMGT (vector, register)
CMGT (vector, zero) Compare signed greater than zero, setting destination vector element to all ones if the condition holds, else zero CMGT (vector, zero)
CMHI (vector, register) Compare unsigned higher, setting destination vector element to all ones if the condition holds, else zero CMHI (vector, register)
CMHS (vector, register) Compare unsigned higher or same, setting destination vector element to all ones if the condition holds, else zero CMHS (vector, register)
CMLE (vector, zero) Compare signed less than or equal to zero, setting destination vector element to all ones if the condition holds, else zero CMLE (vector, zero)
CMLT (vector, zero) Compare signed less than zero, setting destination vector element to all ones if the condition holds, else zero CMLT (vector, zero)
CMTST (vector) Compare bitwise test bits nonzero, setting destination vector element to all ones if the condition holds, else zero CMTST (vector)
CNT (vector) Population count per byte CNT (vector)
DUP (vector, element) Duplicate vector element to vector DUP (vector, element)
DUP (vector, general) Duplicate general-purpose register to vector DUP (vector, general)
EOR (vector) Bitwise exclusive OR EOR (vector)
EXT (vector) Extract vector from pair of vectors EXT (vector)
FABD (vector) Floating-point absolute difference FABD (vector)
FABS (vector) Floating-point absolute value FABS (vector)
FACGE (vector) Floating-point absolute compare greater than or equal FACGE (vector)
FACGT (vector) Floating-point absolute compare greater than FACGT (vector)
FADD (vector) Floating-point add FADD (vector)
FADDP (vector) Floating-point add pairwise FADDP (vector)
FCMEQ (vector, register) Floating-point compare equal, setting destination vector element to all ones if the condition holds, else zero FCMEQ (vector, register)
FCMEQ (vector, zero) Floating-point compare equal to zero, setting destination vector element to all ones if the condition holds, else zero FCMEQ (vector, zero)
FCMGE (vector, register) Floating-point compare greater than or equal, setting destination vector element to all ones if the condition holds, else zero FCMGE (vector, register)
FCMGE (vector, zero) Floating-point compare greater than or equal to zero, setting destination vector element to all ones if the condition holds, else zero FCMGE (vector, zero)
FCMGT (vector, register) Floating-point compare greater than, setting destination vector element to all ones if the condition holds, else zero FCMGT (vector, register)
FCMGT (vector, zero) Floating-point compare greater than zero, setting destination vector element to all ones if the condition holds, else zero FCMGT (vector, zero)
FCMLE (vector, zero) Floating-point compare less than or equal to zero, setting destination vector element to all ones if the condition holds, else zero FCMLE (vector, zero)
FCMLT (vector, zero) Floating-point compare less than zero, setting destination vector element to all ones if the condition holds, else zero FCMLT (vector, zero)
FCVTAS (vector) Floating-point convert to signed integer, rounding to nearest with ties to away FCVTAS (vector)
FCVTAU (vector) Floating-point convert to unsigned integer, rounding to nearest with ties to away FCVTAU (vector)
FCVTL, FCVTL2 (vector) Floating-point convert to higher precision long FCVTL, FCVTL2 (vector)
FCVTMS (vector) Floating-point convert to signed integer, rounding toward minus infinity FCVTMS (vector)
FCVTMU (vector) Floating-point convert to unsigned integer, rounding toward minus infinity FCVTMU (vector)
FCVTN, FCVTN2 (vector) Floating-point convert to lower precision narrow FCVTN, FCVTN2 (vector)
FCVTNS (vector) Floating-point convert to signed integer, rounding to nearest with ties to even FCVTNS (vector)
FCVTNU (vector) Floating-point convert to unsigned integer, rounding to nearest with ties to even FCVTNU (vector)
FCVTPS (vector) Floating-point convert to signed integer, rounding toward positive infinity FCVTPS (vector)
FCVTPU (vector) Floating-point convert to unsigned integer, rounding toward positive infinity FCVTPU (vector)
FCVTXN, FCVTXN2 (vector) Floating-point convert to lower precision narrow, rounding to odd FCVTXN, FCVTXN2 (vector)
FCVTZS (vector, fixed-point) Floating-point convert to signed fixed-point, rounding toward zero FCVTZS (vector, fixed-point)
FCVTZS (vector, integer) Floating-point convert to signed integer, rounding toward zero FCVTZS (vector, integer)
FCVTZU (vector, fixed-point) Floating-point convert to unsigned fixed-point, rounding toward zero FCVTZU (vector, fixed-point)
FCVTZU (vector, integer) Floating-point convert to unsigned integer, rounding toward zero FCVTZU (vector, integer)
FDIV (vector) Floating-point divide FDIV (vector)
FMAX (vector) Floating-point maximum FMAX (vector)
FMAXNM (vector) Floating-point maximum number FMAXNM (vector)
FMAXNMP (vector) Floating-point maximum number pairwise FMAXNMP (vector)
FMAXNMV (vector) Floating-point maximum number across vector FMAXNMV (vector)
FMAXP (vector) Floating-point maximum pairwise FMAXP (vector)
FMAXV (vector) Floating-point maximum across vector FMAXV (vector)
FMIN (vector) Floating-point minimum FMIN (vector)
FMINNM (vector) Floating-point minimum number FMINNM (vector)
FMINNMP (vector) Floating-point minimum number pairwise FMINNMP (vector)
FMINNMV (vector) Floating-point minimum number across vector FMINNMV (vector)
FMINP (vector) Floating-point minimum pairwise FMINP (vector)
FMINV (vector) Floating-point minimum across vector FMINV (vector)
FMLA (vector, by element) Floating-point fused multiply-add to accumulator (by element) FMLA (vector, by element)
FMLA (vector) Floating-point fused multiply-add to accumulator FMLA (vector)
FMLS (vector, by element) Floating-point fused multiply-subtract from accumulator (by element) FMLS (vector, by element)
FMLS (vector) Floating-point fused multiply-subtract from accumulator FMLS (vector)
FMOV (vector, immediate) Floating-point move immediate FMOV (vector, immediate)
FMUL (vector, by element) Floating-point multiply (by element) FMUL (vector, by element)
FMUL (vector) Floating-point multiply FMUL (vector)
FMULX (vector, by element) Floating-point multiply extended (by element) FMULX (vector, by element)
FMULX (vector) Floating-point multiply extended FMULX (vector)
FNEG (vector) Floating-point negate FNEG (vector)
FRECPE (vector) Floating-point reciprocal estimate FRECPE (vector)
FRECPS (vector) Floating-point reciprocal step FRECPS (vector)
FRINTA (vector) Floating-point round to integral, to nearest with ties to away FRINTA (vector)
FRINTI (vector) Floating-point round to integral, using current rounding mode FRINTI (vector)
FRINTM (vector) Floating-point round to integral, toward minus infinity FRINTM (vector)
FRINTN (vector) Floating-point round to integral, to nearest with ties to even FRINTN (vector)
FRINTP (vector) Floating-point round to integral, toward positive infinity FRINTP (vector)
FRINTX (vector) Floating-point round to integral exact, using current rounding mode FRINTX (vector)
FRINTZ (vector) Floating-point round to integral, toward zero FRINTZ (vector)
FRSQRTE (vector) Floating-point reciprocal square root estimate FRSQRTE (vector)
FRSQRTS (vector) Floating-point reciprocal square root step FRSQRTS (vector)
FSQRT (vector) Floating-point square root FSQRT (vector)
FSUB (vector) Floating-point subtract FSUB (vector)
INS (vector, element) Insert vector element from another vector element INS (vector, element)
INS (vector, general) Insert vector element from general-purpose register INS (vector, general)
LD1 (vector, multiple structures) Load multiple 1-element structures to one, two, three or four registers LD1 (vector, multiple structures)
LD1 (vector, single structure) Load single 1-element structure to one lane of one register LD1 (vector, single structure)
LD1R (vector) Load single 1-element structure and replicate to all lanes of one register LD1R (vector)
LD2 (vector, multiple structures) Load multiple 2-element structures to two registers LD2 (vector, multiple structures)
LD2 (vector, single structure) Load single 2-element structure to one lane of two registers LD2 (vector, single structure)
LD2R (vector) Load single 2-element structure and replicate to all lanes of two registers LD2R (vector)
LD3 (vector, multiple structures) Load multiple 3-element structures to three registers LD3 (vector, multiple structures)
LD3 (vector, single structure) Load single 3-element structure to one lane of three registers LD3 (vector, single structure)
LD3R (vector) Load single 3-element structure and replicate to all lanes of three registers LD3R (vector)
LD4 (vector, multiple structures) Load multiple 4-element structures to four registers LD4 (vector, multiple structures)
LD4 (vector, single structure) Load single 4-element structure to one lane of four registers LD4 (vector, single structure)
LD4R (vector) Load single 4-element structure and replicate to all lanes of four registers LD4R (vector)
MLA (vector, by element) Multiply-add to accumulator (by element) MLA (vector, by element)
MLA (vector) Multiply-add to accumulator MLA (vector)
MLS (vector, by element) Multiply-subtract from accumulator (by element) MLS (vector, by element)
MLS (vector) Multiply-subtract from accumulator MLS (vector)
MOV (vector, element) Move vector element to another vector element MOV (vector, element)
MOV (vector, from general) Move general-purpose register to a vector element MOV (vector, from general)
MOV (vector) Move vector MOV (vector)
MOV (vector, to general) Move vector element to general-purpose register MOV (vector, to general)
MOVI (vector) Move immediate MOVI (vector)
MUL (vector, by element) Multiply (by element) MUL (vector, by element)
MUL (vector) Multiply MUL (vector)
MVN (vector) Bitwise NOT MVN (vector)
MVNI (vector) Move inverted immediate MVNI (vector)
NEG (vector) Negate NEG (vector)
NOT (vector) Bitwise NOT NOT (vector)
ORN (vector) Bitwise inclusive OR NOT ORN (vector)
ORR (vector, immediate) Bitwise inclusive OR (immediate) ORR (vector, immediate)
ORR (vector, register) Bitwise inclusive OR (register) ORR (vector, register)
PMUL (vector) Polynomial multiply PMUL (vector)
PMULL, PMULL2 (vector) Polynomial multiply long PMULL, PMULL2 (vector)
RADDHN, RADDHN2 (vector) Rounding add returning high narrow RADDHN, RADDHN2 (vector)
RBIT (vector) Reverse bit order RBIT (vector)
REV16 (vector) Reverse elements in 16-bit halfwords REV16 (vector)
REV32 (vector) Reverse elements in 32-bit words REV32 (vector)
REV64 (vector) Reverse elements in 64-bit doublewords REV64 (vector)
RSHRN, RSHRN2 (vector) Rounding shift right narrow (immediate) RSHRN, RSHRN2 (vector)
RSUBHN, RSUBHN2 (vector) Rounding subtract returning high narrow RSUBHN, RSUBHN2 (vector)
SABA (vector) Signed absolute difference and accumulate SABA (vector)
SABAL, SABAL2 (vector) Signed absolute difference and accumulate long SABAL, SABAL2 (vector)
SABD (vector) Signed absolute difference SABD (vector)
SABDL, SABDL2 (vector) Signed absolute difference long SABDL, SABDL2 (vector)
SADALP (vector) Signed add and accumulate long pairwise SADALP (vector)
SADDL, SADDL2 (vector) Signed add long SADDL, SADDL2 (vector)
SADDLP (vector) Signed add long pairwise SADDLP (vector)
SADDLV (vector) Signed add long across vector SADDLV (vector)
SADDW, SADDW2 (vector) Signed add wide SADDW, SADDW2 (vector)
SCVTF (vector, fixed-point) Signed fixed-point convert to floating-point SCVTF (vector, fixed-point)
SCVTF (vector, integer) Signed integer convert to floating-point SCVTF (vector, integer)
SHADD (vector) Signed halving add SHADD (vector)
SHL (vector) Shift left (immediate) SHL (vector)
SHLL, SHLL2 (vector) Shift left long (by element size) SHLL, SHLL2 (vector)
SHRN, SHRN2 (vector) Shift right narrow (immediate) SHRN, SHRN2 (vector)
SHSUB (vector) Signed halving subtract SHSUB (vector)
SLI (vector) Shift left and insert (immediate) SLI (vector)
SMAX (vector) Signed maximum SMAX (vector)
SMAXP (vector) Signed maximum pairwise SMAXP (vector)
SMAXV (vector) Signed maximum across vector SMAXV (vector)
SMIN (vector) Signed minimum SMIN (vector)
SMINP (vector) Signed minimum pairwise SMINP (vector)
SMINV (vector) Signed minimum across vector SMINV (vector)
SMLAL, SMLAL2 (vector, by element) Signed multiply-add long (by element) SMLAL, SMLAL2 (vector, by element)
SMLAL, SMLAL2 (vector) Signed multiply-add long SMLAL, SMLAL2 (vector)
SMLSL, SMLSL2 (vector, by element) Signed multiply-subtract long (by element) SMLSL, SMLSL2 (vector, by element)
SMLSL, SMLSL2 (vector) Signed multiply-subtract long SMLSL, SMLSL2 (vector)
SMOV (vector) Signed move vector element to general-purpose register SMOV (vector)
SMULL, SMULL2 (vector, by element) Signed multiply long (by element) SMULL, SMULL2 (vector, by element)
SMULL, SMULL2 (vector) Signed multiply long SMULL, SMULL2 (vector)
SQABS (vector) Signed saturating absolute value SQABS (vector)
SQADD (vector) Signed saturating add SQADD (vector)
SQDMLAL, SQDMLAL2 (vector, by element) Signed saturating doubling multiply-add long (by element) SQDMLAL, SQDMLAL2 (vector, by element)
SQDMLAL, SQDMLAL2 (vector) Signed saturating doubling multiply-add long SQDMLAL, SQDMLAL2 (vector)
SQDMLSL, SQDMLSL2 (vector, by element) Signed saturating doubling multiply-subtract long (by element) SQDMLSL, SQDMLSL2 (vector, by element)
SQDMLSL, SQDMLSL2 (vector) Signed saturating doubling multiply-subtract long SQDMLSL, SQDMLSL2 (vector)
SQDMULH (vector, by element) Signed saturating doubling multiply returning high half (by element) SQDMULH (vector, by element)
SQDMULH (vector) Signed saturating doubling multiply returning high half SQDMULH (vector)
SQDMULL, SQDMULL2 (vector, by element) Signed saturating doubling multiply long (by element) SQDMULL, SQDMULL2 (vector, by element)
SQDMULL, SQDMULL2 (vector) Signed saturating doubling multiply long SQDMULL, SQDMULL2 (vector)
SQNEG (vector) Signed saturating negate SQNEG (vector)
SQRDMULH (vector, by element) Signed saturating rounding doubling multiply returning high half (by element) SQRDMULH (vector, by element)
SQRDMULH (vector) Signed saturating rounding doubling multiply returning high half SQRDMULH (vector)
SQRSHL (vector) Signed saturating rounding shift left (register) SQRSHL (vector)
SQRSHRN, SQRSHRN2 (vector) Signed saturating rounded shift right narrow (immediate) SQRSHRN, SQRSHRN2 (vector)
SQRSHRUN, SQRSHRUN2 (vector) Signed saturating rounded shift right unsigned narrow (immediate) SQRSHRUN, SQRSHRUN2 (vector)
SQSHL (vector, immediate) Signed saturating shift left (immediate) SQSHL (vector, immediate)
SQSHL (vector, register) Signed saturating shift left (register) SQSHL (vector, register)
SQSHLU (vector) Signed saturating shift left unsigned (immediate) SQSHLU (vector)
SQSHRN, SQSHRN2 (vector) Signed saturating shift right narrow (immediate) SQSHRN, SQSHRN2 (vector)
SQSHRUN, SQSHRUN2 (vector) Signed saturating shift right unsigned narrow (immediate) SQSHRUN, SQSHRUN2 (vector)
SQSUB (vector) Signed saturating subtract SQSUB (vector)
SQXTN, SQXTN2 (vector) Signed saturating extract narrow SQXTN, SQXTN2 (vector)
SQXTUN, SQXTUN2 (vector) Signed saturating extract unsigned narrow SQXTUN, SQXTUN2 (vector)
SRHADD (vector) Signed rounding halving add SRHADD (vector)
SRI (vector) Shift right and insert (immediate) SRI (vector)
SRSHL (vector) Signed rounding shift left (register) SRSHL (vector)
SRSHR (vector) Signed rounding shift right (immediate) SRSHR (vector)
SRSRA (vector) Signed rounding shift right and accumulate (immediate) SRSRA (vector)
SSHL (vector) Signed shift left (register) SSHL (vector)
SSHLL, SSHLL2 (vector) Signed shift left long (immediate) SSHLL, SSHLL2 (vector)
SSHR (vector) Signed shift right (immediate) SSHR (vector)
SSRA (vector) Signed shift right and accumulate (immediate) SSRA (vector)
SSUBL, SSUBL2 (vector) Signed subtract long SSUBL, SSUBL2 (vector)
SSUBW, SSUBW2 (vector) Signed subtract wide SSUBW, SSUBW2 (vector)
ST1 (vector, multiple structures) Store multiple 1-element structures from one, two, three or four registers ST1 (vector, multiple structures)
ST1 (vector, single structure) Store single 1-element structure from one lane of one register ST1 (vector, single structure)
ST2 (vector, multiple structures) Store multiple 2-element structures from two registers ST2 (vector, multiple structures)
ST2 (vector, single structure) Store single 2-element structure from one lane of two registers ST2 (vector, single structure)
ST3 (vector, multiple structures) Store multiple 3-element structures from three registers ST3 (vector, multiple structures)
ST3 (vector, single structure) Store single 3-element structure from one lane of three registers ST3 (vector, single structure)
ST4 (vector, multiple structures) Store multiple 4-element structures from four registers ST4 (vector, multiple structures)
ST4 (vector, single structure) Store single 4-element structure from one lane of four registers ST4 (vector, single structure)
SUB (vector) Subtract SUB (vector)
SUBHN, SUBHN2 (vector) Subtract returning high narrow SUBHN, SUBHN2 (vector)
SUQADD (vector) Signed saturating accumulate of unsigned value SUQADD (vector)
SXTL, SXTL2 (vector) Signed extend long SXTL, SXTL2 (vector)
TBL (vector) Table vector lookup TBL (vector)
TBX (vector) Table vector lookup extension TBX (vector)
TRN1 (vector) Transpose vectors (primary) TRN1 (vector)
TRN2 (vector) Transpose vectors (secondary) TRN2 (vector)
UABA (vector) Unsigned absolute difference and accumulate UABA (vector)
UABAL, UABAL2 (vector) Unsigned absolute difference and accumulate long UABAL, UABAL2 (vector)
UABD (vector) Unsigned absolute difference UABD (vector)
UABDL, UABDL2 (vector) Unsigned absolute difference long UABDL, UABDL2 (vector)
UADALP (vector) Unsigned add and accumulate long pairwise UADALP (vector)
UADDL, UADDL2 (vector) Unsigned add long UADDL, UADDL2 (vector)
UADDLP (vector) Unsigned add long pairwise UADDLP (vector)
UADDLV (vector) Unsigned sum long across vector UADDLV (vector)
UADDW, UADDW2 (vector) Unsigned add wide UADDW, UADDW2 (vector)
UCVTF (vector, fixed-point) Unsigned fixed-point convert to floating-point UCVTF (vector, fixed-point)
UCVTF (vector, integer) Unsigned integer convert to floating-point UCVTF (vector, integer)
UHADD (vector) Unsigned halving add UHADD (vector)
UHSUB (vector) Unsigned halving subtract UHSUB (vector)
UMAX (vector) Unsigned maximum UMAX (vector)
UMAXP (vector) Unsigned maximum pairwise UMAXP (vector)
UMAXV (vector) Unsigned maximum across vector UMAXV (vector)
UMIN (vector) Unsigned minimum UMIN (vector)
UMINP (vector) Unsigned minimum pairwise UMINP (vector)
UMINV (vector) Unsigned minimum across vector UMINV (vector)
UMLAL, UMLAL2 (vector, by element) Unsigned multiply-add long (by element) UMLAL, UMLAL2 (vector, by element)
UMLAL, UMLAL2 (vector) Unsigned multiply-add long UMLAL, UMLAL2 (vector)
UMLSL, UMLSL2 (vector, by element) Unsigned multiply-subtract long (by element) UMLSL, UMLSL2 (vector, by element)
UMLSL, UMLSL2 (vector) Unsigned multiply-subtract long UMLSL, UMLSL2 (vector)
UMOV (vector) Unsigned move vector element to general-purpose register UMOV (vector)
UMULL, UMULL2 (vector, by element) Unsigned multiply long (by element) UMULL, UMULL2 (vector, by element)
UMULL, UMULL2 (vector) Unsigned multiply long UMULL, UMULL2 (vector)
UQADD (vector) Unsigned saturating add UQADD (vector)
UQRSHL (vector) Unsigned saturating rounding shift left (register) UQRSHL (vector)
UQRSHRN, UQRSHRN2 (vector) Unsigned saturating rounded shift right narrow (immediate) UQRSHRN, UQRSHRN2 (vector)
UQSHL (vector, immediate) Unsigned saturating shift left (immediate) UQSHL (vector, immediate)
UQSHL (vector, register) Unsigned saturating shift left (register) UQSHL (vector, register)
UQSHRN, UQSHRN2 (vector) Unsigned saturating shift right narrow (immediate) UQSHRN, UQSHRN2 (vector)
UQSUB (vector) Unsigned saturating subtract UQSUB (vector)
UQXTN, UQXTN2 (vector) Unsigned saturating extract narrow UQXTN, UQXTN2 (vector)
URECPE (vector) Unsigned reciprocal estimate URECPE (vector)
URHADD (vector) Unsigned rounding halving add URHADD (vector)
URSHL (vector) Unsigned rounding shift left (register) URSHL (vector)
URSHR (vector) Unsigned rounding shift right (immediate) URSHR (vector)
URSQRTE (vector) Unsigned reciprocal square root estimate URSQRTE (vector)
URSRA (vector) Unsigned rounding shift right and accumulate (immediate) URSRA (vector)
USHL (vector) Unsigned shift left (register) USHL (vector)
USHLL, USHLL2 (vector) Unsigned shift left long (immediate) USHLL, USHLL2 (vector)
USHR (vector) Unsigned shift right (immediate) USHR (vector)
USQADD (vector) Unsigned saturating accumulate of signed value USQADD (vector)
USRA (vector) Unsigned shift right and accumulate (immediate) USRA (vector)
USUBL, USUBL2 (vector) Unsigned subtract long USUBL, USUBL2 (vector)
USUBW, USUBW2 (vector) Unsigned subtract wide USUBW, USUBW2 (vector)
UXTL, UXTL2 (vector) Unsigned extend long UXTL, UXTL2 (vector)
UZP1 (vector) Unzip vectors (primary) UZP1 (vector)
UZP2 (vector) Unzip vectors (secondary) UZP2 (vector)
XTN, XTN2 (vector) Extract narrow XTN, XTN2 (vector)
ZIP1 (vector) Zip vectors (primary) ZIP1 (vector)
ZIP2 (vector) Zip vectors (secondary) ZIP2 (vector)
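
Several rows in the table describe compares that set each destination element "to all ones if the condition holds, else zero". The sketch below is a minimal scalar model of that behavior, and of how such a mask can feed BSL (bitwise select). It is not armasm syntax. The function names, the list-of-lanes representation and the `bits` parameter are assumptions made purely for illustration.

```python
# Scalar model of a few rows from Table 20-1. Illustrative only: the function
# names, the list-of-lanes representation and the `bits` parameter are
# assumptions made for this sketch, not armasm syntax.

def _ones(bits):
    """All-ones value for one lane of the given width."""
    return (1 << bits) - 1

def _signed(x, bits):
    """Interpret the low `bits` bits of x as a two's complement integer."""
    x &= _ones(bits)
    return x - (1 << bits) if x >> (bits - 1) else x

def cmeq(a, b, bits=32):
    """CMEQ (vector, register): all ones where lanes are bitwise equal."""
    m = _ones(bits)
    return [m if (x & m) == (y & m) else 0 for x, y in zip(a, b)]

def cmgt(a, b, bits=32):
    """CMGT (vector, register): all ones where a > b as signed integers."""
    m = _ones(bits)
    return [m if _signed(x, bits) > _signed(y, bits) else 0
            for x, y in zip(a, b)]

def cmhi(a, b, bits=32):
    """CMHI (vector, register): all ones where a > b as unsigned integers."""
    m = _ones(bits)
    return [m if (x & m) > (y & m) else 0 for x, y in zip(a, b)]

def bsl(d, n, m, bits=32):
    """BSL: per bit, take n where the original destination bit d is 1, else m."""
    mask = _ones(bits)
    return [((x & y) | (~x & z)) & mask for x, y, z in zip(d, n, m)]

if __name__ == "__main__":
    a = [5, 0xFFFFFFFF, 7, 3]   # lane 1 is -1 when read as signed
    b = [2, 0, 7, 9]
    gt = cmgt(a, b)
    print([hex(v) for v in gt])          # ['0xffffffff', '0x0', '0x0', '0x0']
    print([hex(v) for v in cmhi(a, b)])  # ['0xffffffff', '0xffffffff', '0x0', '0x0']
    print(bsl(gt, a, b))                 # [5, 0, 7, 9]
```

Lane 1 shows why the signed (CMGT) and unsigned (CMHI) compares are listed separately: the same bit pattern 0xFFFFFFFF is less than zero when signed but greater than zero when unsigned. Feeding the CMGT mask into BSL picks the signed maximum of each lane. That is only a demonstration of how the mask is used, since SMAX in the table performs the operation directly.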