Targeting processors, floating-point units, and NEON

Arm DS-5 Development Studio tutorial on selecting a specific processor with Arm Compiler to maximize performance, selecting an FPU, and enabling NEON.

Introduction

This tutorial assumes you have installed and licensed Arm DS-5 Development Studio. For more information, see Getting Started with Arm DS-5 Development Studio.


Selecting the target processor

The Arm Compiler lets you target either an architecture or a specific processor when generating code:

  • Specifying an architecture provides the greatest code compatibility. The generated code can run on any processor supporting that architecture.
  • Specifying a particular processor provides optimum performance. The compiler can use processor-specific features such as instruction scheduling to generate optimized code for that specific processor.

The --cpu command-line option lets you specify the name of either an architecture or a specific processor target.

To configure the --cpu option in DS-5:

  1. Select your project in the Project Explorer view.
  2. Select Project > Properties from the main menu to display the Properties dialog box.
  3. Expand C/C++ Build, then Settings in the Properties dialog box.
  4. On the Tool Settings tab, select Arm C Compiler > Code Generation to display the code generation settings.
  5. Enter a value for "Target CPU (--cpu)".
  6. Click OK to save the settings.

You can see a list of all supported architecture and processor names by specifying list for the "Target CPU (--cpu)" setting, then building your project. The console (Window > Show View > Console) shows the list of architecture and processor names.

If the compiled program is to run on a specific Arm architecture-based processor, select the target processor. For example, to compile code to run on a Cortex-A9 processor use the "Target CPU (--cpu)" setting Cortex-A9.

Alternatively, if the compiled program is to run on different Arm processors, choose the lowest common denominator architecture appropriate for the application and then specify that architecture in place of the processor name. For example, to compile code for processors supporting the Armv7 architecture use the "Target CPU (--cpu)" setting 7.

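One way to confirm from C which target a build actually used is to inspect the compiler's predefined macros. The sketch below assumes the armcc predefined macros __TARGET_ARCH_ARM and __TARGET_CPU_CORTEX_A9 (the latter being the name armcc is assumed to derive from --cpu=Cortex-A9; check the armcc documentation for your version). The function names are hypothetical, and on compilers that do not define these macros the functions simply report 0.

```c
#include <stdio.h>

/* Base Arm architecture number the code was built for, or 0 when the
 * compiler does not define __TARGET_ARCH_ARM (assumed here to be an
 * armcc predefined macro). */
int target_arch_number(void)
{
#if defined(__TARGET_ARCH_ARM)
    return __TARGET_ARCH_ARM;
#else
    return 0;
#endif
}

/* 1 if the build named the Cortex-A9 processor specifically, 0 for an
 * architecture-only setting, a different processor, or a compiler that
 * does not define the macro. */
int built_for_cortex_a9(void)
{
#if defined(__TARGET_CPU_CORTEX_A9)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper that prints both values. */
void print_target(void)
{
    printf("base architecture: %d, Cortex-A9 specific: %s\n",
           target_arch_number(),
           built_for_cortex_a9() ? "yes" : "no");
}
```

A build that names only an architecture would be expected to report "no" for the processor-specific check, because no specific processor was selected.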

Selecting the target FPU

Every --cpu target has an associated implicit Floating-Point Unit (FPU). A full list is available in Processors and their implicit Floating-Point Units (FPUs) in the Arm Compiler armcc User Guide.

However, you can use the --fpu command-line option to override the implicit FPU. For example, the option --cpu=ARM1136JF-S --fpu=softvfp generates code that uses the software floating-point library fplib, even though the choice of processor implies the VFPv2 architecture.

To configure the --fpu option in DS-5, use the "Target FPU (--fpu)" setting. This setting is in the same location on the Properties dialog box as the "Target CPU (--cpu)" setting discussed above.

You can see a list of all supported FPU architectures by specifying list for the "Target FPU (--fpu)" setting, then building your project. The console (Window > Show View > Console) shows the list of FPU architectures.

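The --fpu choice does not change the C source, only how floating-point operations are implemented. The sketch below illustrates this under the assumption that armcc predefines __TARGET_FPU_SOFTVFP, __TARGET_FPU_VFP, or __TARGET_FPU_NONE according to the selected FPU; the enum and function names are hypothetical, and other compilers fall through to FP_UNKNOWN.

```c
/* Floating-point model selected at compile time. */
typedef enum {
    FP_UNKNOWN,   /* compiler does not define the assumed macros */
    FP_NONE,      /* no floating-point support selected */
    FP_SOFTWARE,  /* software floating-point library */
    FP_HARDWARE   /* hardware VFP instructions */
} fp_model;

fp_model floating_point_model(void)
{
#if defined(__TARGET_FPU_SOFTVFP)
    return FP_SOFTWARE;
#elif defined(__TARGET_FPU_VFP)
    return FP_HARDWARE;
#elif defined(__TARGET_FPU_NONE)
    return FP_NONE;
#else
    return FP_UNKNOWN;
#endif
}

/* The same source line is compiled either to hardware floating-point
 * instructions or to calls into the software library, depending on
 * the --fpu setting. */
float scale_and_offset(float x, float gain, float offset)
{
    return x * gain + offset;
}
```

Comparing the disassembly of scale_and_offset() under two different "Target FPU (--fpu)" settings is one way to see the effect of the option.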

Enabling NEON

Arm NEON technology is the implementation of the Advanced SIMD architecture extension. It is a 64-bit and 128-bit hybrid SIMD technology targeted at advanced media and signal processing applications and embedded processors.

Specific NEON instructions let you use the NEON unit to perform operations in parallel on multiple lanes of data.

There are a number of different methods of creating code that uses NEON instructions:

  • Write assembly language, or use embedded assembly language in C, and use the NEON instructions directly.
  • Write in C or C++ using the NEON intrinsics.
  • Call a library routine that has been optimized to use NEON instructions.
  • Have the compiler use automatic vectorization to optimize loops for NEON.
 
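As an illustration of the intrinsics approach, the following is a minimal sketch that adds two float arrays four lanes at a time. It assumes the compiler provides arm_neon.h and defines __ARM_NEON__ or __ARM_NEON when NEON is available; the HAVE_NEON macro and the function name add_f32 are hypothetical. When NEON is not available, the scalar loop handles every element.

```c
#include <stddef.h>

#if defined(__ARM_NEON__) || defined(__ARM_NEON)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* dst[i] = a[i] + b[i] for i in [0, n). */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if HAVE_NEON
    /* Process four 32-bit lanes per 128-bit vector. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(dst + i, vaddq_f32(va, vb));
    }
#endif
    /* Scalar tail, or the whole array when NEON is unavailable. */
    for (; i < n; ++i) {
        dst[i] = a[i] + b[i];
    }
}
```

Keeping a scalar path alongside the intrinsics lets the same source build for targets without a NEON unit.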

To enable automatic vectorization, you must target a processor that has a NEON unit. The required command-line options are:

  1. A target --cpu that has NEON capability, for example Cortex-A7, Cortex-A8, Cortex-A9, Cortex-A12, or Cortex-A15.

    To configure this in DS-5, use "Target CPU (--cpu)" on the Arm C Compiler > Code Generation settings in the Properties dialog box.

  2. --vectorize to enable NEON vectorization.

    To configure this in DS-5, use "Vectorization (--vectorize)" on the Arm C Compiler > Code Generation settings in the Properties dialog box.

  3. -O2 (default) or -O3 optimization level.

    To configure this in DS-5, use "Optimization level" on the Arm C Compiler > Optimizations settings in the Properties dialog box.

  4. -Otime to optimize for performance instead of code size.

    To configure this in DS-5, use "Optimize for" on the Arm C Compiler > Optimizations settings in the Properties dialog box.

Note that you may need to enable the NEON unit before you can use NEON instructions.

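For automatic vectorization, the shape of the loop matters as much as the options. The following sketch shows a loop that is a plausible candidate: a simple counted loop over integer data, with pointers marked __restrict so the compiler can assume the arrays do not overlap. The function name add_arrays is hypothetical, and whether the loop is actually vectorized depends on the compiler version and on the --cpu, --vectorize, -O2 or -O3, and -Otime settings described in this section.

```c
/* pa[i] += pb[i] for i in [0, n).
 * __restrict tells the compiler that pa and pb do not alias, which
 * removes one common obstacle to vectorizing the loop. */
void add_arrays(int * __restrict pa, const int * __restrict pb,
                unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; ++i) {
        pa[i] += pb[i];
    }
}
```

Inspecting the generated assembly for NEON instructions is one way to check whether the compiler vectorized the loop.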

Further reading