Optimizing for code size or performance

The compiler and associated tools use many techniques for optimizing your code. Some of these techniques improve the performance of your code, while other techniques reduce the size of your code.

Note

This topic includes descriptions of [ALPHA] features. See Support level definitions.

Different optimizations often work against each other. That is, techniques for improving code performance might result in increased code size, and techniques for reducing code size might reduce performance. For example, the compiler can unroll small loops for higher performance, with the disadvantage of increased code size.

The default optimization level is -O0. At -O0, armclang does not perform optimization.

The following armclang options help you optimize for code performance:

-O1 | -O2 | -O3
Specify the level of optimization to be used when compiling source files. A higher number implies a higher level of optimization for performance.
-Ofast
Enables all the optimizations from -O3 along with other aggressive optimizations that might violate strict compliance with language standards.
-Omax
Enables all the optimizations from -Ofast along with Link Time Optimization (LTO).
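
As an illustration of why the aggressive optimizations at -Ofast can matter, the following sketch shows code whose result depends on strict floating-point evaluation. It assumes that -Ofast relaxes floating-point semantics in the way that the fast-math mode of upstream Clang does, which the option description above does not spell out. The function name kahan_sum is hypothetical.

```c
#include <stddef.h>

/* Compensated (Kahan) summation. The correction term relies on the compiler
   evaluating each floating-point operation exactly as written. */
float kahan_sum(const float *values, size_t n)
{
    float sum = 0.0f;
    float comp = 0.0f; /* running estimate of the rounding error lost so far */

    for (size_t i = 0; i < n; i++) {
        float y = values[i] - comp;
        float t = sum + y;
        /* Algebraically, (t - sum) - y is zero. In strict IEEE arithmetic it
           captures the rounding error of the addition. A compiler that is
           allowed to reassociate floating-point expressions might simplify
           it to zero, which silently turns this into a plain summation. */
        comp = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```

If code like this is numerically sensitive, it is worth comparing results between -O3 and -Ofast before adopting the more aggressive level.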

The following armclang options help you optimize for code size:

-Os
Performs optimizations to reduce the code size at the expense of a possible increase in execution time. This option aims to balance code size reduction against performance.
-Oz
Optimizes for the smallest code size, more aggressively than -Os, with a possible further increase in execution time.

For more information on optimization levels, see Selecting optimization levels.

Note

You can also set the optimization level for the linker with the armlink option --lto_level. The optimization levels available for armlink are the same as the armclang optimization levels.

The following armclang options can also help reduce code and data size:

-fshort-enums
Allows the compiler to set the size of an enumeration type to the smallest data type that can hold all enumerator values.
-fshort-wchar
Sets the size of wchar_t to 2 bytes.
-fno-exceptions
C++ only. Disables the generation of code that is needed to support C++ exceptions.
-fno-rtti [ALPHA]
C++ only. Disables the generation of code that is needed to support Run Time Type Information (RTTI) features.
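
To see what -fshort-enums and -fshort-wchar change, a small sketch like the following reports the sizes that the compiler chose. The type and function names are hypothetical, and the sizes mentioned in the comments are typical values rather than guarantees.

```c
#include <stddef.h>
#include <wchar.h>

/* All enumerator values fit in one byte, so with -fshort-enums the compiler
   is allowed to use a one-byte type. Without the option, an int-sized type
   is typical. */
enum color { COLOR_RED, COLOR_GREEN, COLOR_BLUE };

/* A structure holding an enum shrinks too when the enum does. */
struct pixel_tag {
    enum color c;
    unsigned char alpha;
};

size_t enum_size(void)   { return sizeof(enum color); }
size_t struct_size(void) { return sizeof(struct pixel_tag); }

/* With -fshort-wchar this is expected to be 2. */
size_t wchar_size(void)  { return sizeof(wchar_t); }
```

Because these options change the size of types, objects built with them are generally not compatible with objects or libraries built without them, so apply them consistently across the whole image.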

The following armclang option helps you optimize for both code size and code performance:

-flto
Enables Link Time Optimization (LTO), which lets the linker make additional optimizations across multiple source files. See Optimizing across modules with link time optimization for more information.

Note

If you want to use LTO when invoking armlink separately, you can use the armlink option --lto_level to select the LTO optimization level that matches your optimization goal.

In addition, choices you make during coding can affect optimization. For example:

  • Optimizing loop termination conditions can improve both code size and performance. In particular, loops with counters that decrement to zero usually produce smaller, faster code than loops with incrementing counters.
  • Manually unrolling loops by reducing the number of loop iterations, but increasing the amount of work that is done in each iteration, can improve performance at the expense of code size.
  • Reducing debug information in objects and libraries reduces the size of your image.
  • Using inline functions offers a trade-off between code size and performance.
  • Using intrinsics can improve performance.
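
The loop and inlining choices above can be sketched as follows. This is a minimal illustration, not a benchmark: the function names are hypothetical, and whether any variant is actually smaller or faster depends on the target and on the optimization level, because the compiler might already perform similar transformations itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Small helper declared static inline. The compiler may substitute the body
   at each call site (usually faster, possibly larger) or keep one copy. */
static inline uint32_t square(uint32_t x)
{
    return x * x;
}

/* Incrementing counter: each iteration compares i against n. */
uint32_t sum_squares_up(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += square(data[i]);
    }
    return total;
}

/* Decrementing counter: the loop ends when the counter reaches zero, so the
   termination test needs no separate comparison against n. */
uint32_t sum_squares_down(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    for (size_t i = n; i != 0; i--) {
        total += square(data[i - 1]);
    }
    return total;
}

/* Manually unrolled by four: fewer iterations and branches, more work per
   iteration, and more code. A second loop handles the leftover elements. */
uint32_t sum_squares_unrolled(const uint32_t *data, size_t n)
{
    uint32_t total = 0;
    size_t i = 0;
    for (; n - i >= 4; i += 4) {
        total += square(data[i]);
        total += square(data[i + 1]);
        total += square(data[i + 2]);
        total += square(data[i + 3]);
    }
    for (; i < n; i++) {
        total += square(data[i]);
    }
    return total;
}
```

All three functions return the same result for the same input, so the choice between them is purely a size and speed trade-off. Comparing the generated code at the optimization level you actually ship is the most reliable way to decide.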