Compiling for Neon with Arm Compiler 6
To enable automatic vectorization, you must specify appropriate compiler options to do the following:
- Target a processor that has Neon capabilities.
- Specify an optimization level that includes auto-vectorization.
In addition, specifying the -Rpass=loop compiler option displays useful diagnostic information from the compiler about how it optimized particular loops. This information includes vectorization width and interleave count.
-Rpass=loop is a [COMMUNITY] feature of Arm Compiler.
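As an illustration of the kind of loop these options apply to, the following is a minimal sketch of an element-wise array addition. The function name vec_add, the file name vec_add.c, and the exact combination of options in the comment are assumptions made for this example; the remarks that the compiler prints depend on the compiler version and target.

```c
#include <stddef.h>

/*
 * Element-wise addition of two float arrays.
 *
 * The restrict qualifiers tell the compiler that the arrays do not
 * overlap, which removes one common obstacle to auto-vectorization.
 *
 * Hypothetical build command (file name and option mix are assumptions):
 *   armclang --target=arm-arm-none-eabi -mcpu=cortex-a53 -O2 -Rpass=loop -c vec_add.c
 *
 * With -Rpass=loop, the compiler is expected to report how it optimized
 * this loop, including the vectorization width and interleave count.
 */
void vec_add(float *restrict dst, const float *restrict a,
             const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = a[i] + b[i];
    }
}
```

The function produces the same results whether or not the loop is vectorized, so the diagnostic output is the way to confirm what the compiler actually did.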
Specifying a Neon-capable target
Neon is required in all standard Armv8-A implementations, so targeting any Armv8-A architecture or processor will allow the generation of Neon code.
If you only want to run code on one particular processor, you can target that specific processor. Performance is optimized for the micro-architectural specifics of that processor. However, the code is only guaranteed to run on that processor.
If you want your code to run on a wide range of processors, you can target an architecture. Generated code runs on any processor implementation of that target architecture, but performance might be impacted.
To target Armv8‑A AArch64 state:
armclang --target=aarch64-arm-none-eabi
To target the Cortex‑A53 in AArch32 state:
armclang --target=arm-arm-none-eabi -mcpu=cortex-a53
For the older Armv7 architecture, where Neon was optional, you can use the -mfpu options to specify that Neon is available.
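One way to check that the chosen target options really make Neon available is to test the ACLE feature macro __ARM_NEON at compile time. The following is a minimal sketch; the function name built_with_neon is hypothetical and exists only for this example.

```c
/*
 * Reports whether this translation unit was compiled for a target
 * with Neon available.
 *
 * __ARM_NEON is the feature-test macro defined by the Arm C Language
 * Extensions (ACLE) when Advanced SIMD (Neon) is available for the
 * selected target. built_with_neon is a hypothetical helper name.
 */
int built_with_neon(void)
{
#if defined(__ARM_NEON)
    return 1;   /* Neon is available for the selected target */
#else
    return 0;   /* Neon is not available (or this is not an Arm target) */
#endif
}
```

Because the check happens in the preprocessor, the result reflects the options passed to the compiler, not the capabilities of the processor the code later runs on.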
Specifying an auto-vectorizing optimization level
Arm Compiler 6 provides a wide range of optimization levels, selected with the -O option:
| Option | Meaning | Auto-vectorization |
|---|---|---|
| -O0 | Minimum optimization | Always disabled. |
| -O1 | Restricted optimization | Disabled by default. |
| -O2 | High optimization | Enabled by default. |
| -O3 | Very high optimization | Enabled by default. |
| -Os | Reduce code size, balancing code size against code speed. | Enabled by default. |
| -Oz | Smallest possible code size | Enabled by default. |
| -Ofast | Optimize for high performance beyond -O3 | Enabled by default. |
| -Omax | Optimize for high performance beyond -Ofast | Enabled by default. |
See Selecting optimization options in the Arm Compiler User Guide and -O in the Arm Compiler armclang Reference Guide for more details about these options.
Auto-vectorization is enabled by default at optimization level -O2 and higher. The -fno-vectorize option lets you disable auto-vectorization.
At optimization level -O1, auto-vectorization is disabled by default. The -fvectorize option lets you enable auto-vectorization.
At optimization level -O0, auto-vectorization is always disabled. If you specify the -fvectorize option, the compiler ignores it.
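The rules above can be summarized against a concrete loop. The following is a minimal sketch of an integer reduction that is a plausible candidate for auto-vectorization; the function name sum_u32 is hypothetical, and whether the compiler actually vectorizes the loop at a given level depends on its cost model.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sums n unsigned 32-bit integers (wrapping on overflow).
 *
 * How the options described above are expected to apply to this loop:
 *   -O0                      never vectorized, -fvectorize is ignored
 *   -O1                      not vectorized by default
 *   -O1 -fvectorize          auto-vectorization enabled
 *   -O2 and higher           auto-vectorization enabled by default
 *   -O2 -fno-vectorize       auto-vectorization disabled
 *
 * sum_u32 is a hypothetical name used only for this illustration.
 */
uint32_t sum_u32(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += data[i];
    }
    return total;
}
```

An integer reduction is used here because unsigned addition can be reordered without changing the result, so the function returns the same value whether or not the loop is vectorized.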