October 7, 2024

What is new in LLVM 19?

Arm architecture and performance improvement contribution summary for LLVM 19 release.

LLVM 19.1.0 was released on September 17th, 2024. Since the previous release, Arm has contributed close to 1000 commits to enable new features and performance improvements, which are summarized below.

To find out more about the previous LLVM release, you can read the Part 1: What is new in LLVM 18? and Part 2: What is new in LLVM 18? blog posts.

New architecture and CPU support

By Jonathan Thackray

On the CPU side, this release extends the lineup of Armv9.2-A cores with support for the latest Arm Cortex-A520AE, Cortex-A720AE, Cortex-A725 and Cortex-X925. It also adds support for the new Armv9.2-A data-center cores Neoverse-N3, Neoverse-V3 and Neoverse-V3AE.

Missing support for older cores such as Cortex-R52+, Cortex-R82AE and Cortex-A78AE has also been added. The Cortex-R82 definition has been corrected to enable the FEAT_FLAGM, FEAT_PERFMON and FEAT_PREDRES architecture extensions.

A total of 15 extra CPU identifiers have been added for completeness, which allows the correct extensions to be automatically enabled when using the -mcpu=native command line option.

Performance improvements

Improvement for SPECrate 2017 Integer Benchmark

By Kiran Chandramohan

We improved the performance of 548.exchange2_r by around 9% for this release. This was achieved by removing some redundant loops and improving the lowering of the Minloc and Maxloc Fortran intrinsics when they are called with mask or dim arguments.

[Figure: SPEC2017 intrate scores diagram.]
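For readers less familiar with these intrinsics, the following is a minimal C sketch of what a rank-1 Minloc with a mask argument computes. It only models the semantics; it is not the code Flang generates, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns the 1-based position of the first smallest element whose mask
 * entry is true, or 0 when no mask entry is true (hypothetical helper
 * modeling Fortran's MINLOC(values, MASK=mask) for a rank-1 array). */
size_t minloc_masked(const int *values, const bool *mask, size_t n) {
    size_t best = 0; /* 0 means "nothing selected yet" */
    for (size_t i = 0; i < n; ++i) {
        if (!mask[i]) {
            continue; /* masked-out elements never participate */
        }
        if (best == 0 || values[i] < values[best - 1]) {
            best = i + 1; /* strict '<' keeps the first of equal minima */
        }
    }
    return best;
}
```

The mask test inside the loop is the part that makes a naive lowering more expensive than the unmasked case.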

Code generation improvements

Loop Idiom recognition improvement for ctlz

By Kiran Chandramohan

The loop idiom recognition code for recognizing and inserting a ctlz intrinsic did not support loops where the loopback control is based on an unsigned less-than condition. We improved this pass to recognize these loops and insert a ctlz intrinsic.
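As a rough illustration of the kind of loop involved, the sketch below pairs a shift loop whose exit test is an unsigned less-than comparison with an equivalent closed form built on count-leading-zeros. The function names are hypothetical, __builtin_clz is a GCC/Clang builtin rather than standard C, and whether the LLVM pass matches this exact loop shape is an assumption.

```c
#include <limits.h>

/* Loop form: keep shifting right while x >= 4, that is, until the
 * unsigned comparison x < 4 becomes true. */
unsigned shift_count_loop(unsigned x) {
    unsigned count = 0;
    while (x >= 4u) {
        x >>= 1;
        ++count;
    }
    return count;
}

/* Closed form: the number of shifts depends only on the position of the
 * highest set bit, which count-leading-zeros gives directly. */
unsigned shift_count_clz(unsigned x) {
    if (x < 4u) {
        return 0; /* also avoids __builtin_clz(0), which is undefined */
    }
    unsigned width = (unsigned)(sizeof(unsigned) * CHAR_BIT);
    unsigned bit_length = width - (unsigned)__builtin_clz(x);
    return bit_length - 2u; /* shift until two significant bits remain */
}
```

Both functions return the same value for every input, which is what allows a compiler to replace the loop with a single ctlz-based computation.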

Guarded Control Stack support

By John Brawn

The LLVM toolchain now has support for building objects and executables that are compatible with the Guarded Control Stack (GCS) extension:

The clang option -mbranch-protection=gcs enables GCS support, and -mbranch-protection=standard will also enable GCS support. This will set the GCS GNU property bit on output objects. It doesn't cause any code generation changes, as the code generated by clang is already compatible with GCS.
lld will automatically mark output executables with the GCS GNU property bit if all input objects have it set. The -z gcs option can be used to force the bit to be set, and -z gcs-report will warn on all objects that do not have the bit set.
libunwind can be built with GCS support using the CMake option LIBUNWIND_ENABLE_GCS=ON, or by building libunwind with -mbranch-protection=standard. This causes the GCS stack to be unwound when unwinding a C++ exception.

CMSE security mitigation

By Victor Campos

The code generation for the Cortex-M Security Extensions (CMSE) had a security vulnerability, filed as CVE-2024-0151, which states:

Insufficient argument checking in Secure state Entry functions in software using Cortex-M Security Extensions (CMSE), that has been compiled using toolchains that implement 'Arm v8-M Security Extensions Requirements on Development Tools' prior to version 1.4, allows an attacker to pass values to Secure state that are out of range for types smaller than 32-bits. Out of range values might lead to incorrect operations in secure state.

This attack was made possible because the Procedure Call Standard determines that integral typed arguments smaller than a word are zero- or sign-extended by the caller. The same rule applies for return values in the callees.

An attacker could therefore deliberately pass an argument to a CMSE Secure Entry function (or return a value from a CMSE Nonsecure Call function) whose declared type in C/C++ is smaller than 4 bytes, but whose actual contents span more than the declared type's bit width. Such a value might be used to access, for instance, secure memory buffers and cause out-of-bounds accesses, potentially leading to secure information leakage.

To mitigate this vulnerability, LLVM's ARM backend now assumes that the mentioned Procedure Call Standard rules might not be respected: code generation of CMSE function definitions and calls now emits sign or zero extension of values before they are first used.

Pull Request: https://github.com/llvm/llvm-project/pull/89944
Arm Security Bulletin: https://developer.arm.com/Arm%20Security%20Center/Cortex-M%20Security%20Extensions
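The effect of the mitigation can be modeled on any host with plain C. This is only a sketch: raw_reg stands in for the 32-bit register that carries a narrow argument, the function names are hypothetical, and no actual CMSE attributes or Arm-specific code generation are involved.

```c
#include <stdint.h>

/* raw_reg models the full 32-bit register that carries an argument whose
 * declared C type is only 8 bits wide. */

/* What a callee effectively sees if it trusts the caller to have extended
 * the value: all 32 bits are taken at face value. */
uint32_t index_trusting_caller(uint32_t raw_reg) {
    return raw_reg;
}

/* Zero-extension on entry: keep only the low 8 bits, as for a uint8_t. */
uint32_t zero_extend_u8(uint32_t raw_reg) {
    return raw_reg & 0xFFu;
}

/* Sign-extension on entry: interpret the low 8 bits as an int8_t. */
int32_t sign_extend_s8(uint32_t raw_reg) {
    return (int32_t)((raw_reg & 0xFFu) ^ 0x80u) - 0x80;
}
```

With a 256-entry table indexed by a uint8_t parameter, a raw register value of 0x1234 would index far past the end of the table if taken at face value, whereas the zero-extended value 0x34 stays in range.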

Tools improvements

TargetParser Refactoring

By Tomas Matheson

Significant work has been done on tidying up the AArch64 TargetParser implementation and improving consistency and testability of CPUs, base architectures and their extensions:

Dependency resolution was made consistent throughout the compiler, the handling of FMV extensions was improved, and the target attribute was brought in line with GCC behavior.
The new --print-enabled-extensions command line flag was introduced to print exactly which extensions are enabled for a given -mcpu / -march configuration.

Function Multiversioning

By Alexandros Lamprineas

Many improvements were made to the LLVM support for function multi-versioning. Some highlights are:

Mixing target_version with target_clones attributes is now allowed
Function version declarations are allowed in any order
Lexicographic order of feature names is used when mangling. Duplicated features are removed from mangled names
The resolver is not emitted on use but in the translation unit where the default version definition resides
Runtime checks for features implied by the command line do not get optimized away from the body of the resolver
Alias to IFUNC resolver is no longer emitted
Various bug fixes
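The mangling rule in the list above (lexicographic order, duplicates removed) can be sketched as a small standalone C routine. This is an illustration, not clang's implementation: the function name is hypothetical, and the "._M<feature>" suffix shape is an assumption based on our understanding of the ACLE naming scheme. The default version, which uses a different suffix, is not modeled.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of C strings. */
static int compare_names(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Writes "<base>._M<feat>M<feat>..." into out, with feature names sorted
 * lexicographically and duplicates dropped. Sorts the caller's array in
 * place. Returns 0 on success, -1 if out is too small. */
int mangle_version(const char *base, const char **features, size_t count,
                   char *out, size_t out_size) {
    qsort(features, count, sizeof *features, compare_names);

    int n = snprintf(out, out_size, "%s._", base);
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    size_t used = (size_t)n;

    for (size_t i = 0; i < count; ++i) {
        /* After sorting, duplicates are adjacent, so one look back suffices. */
        if (i > 0 && strcmp(features[i], features[i - 1]) == 0) {
            continue;
        }
        n = snprintf(out + used, out_size - used, "M%s", features[i]);
        if (n < 0 || (size_t)n >= out_size - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return 0;
}
```

Sorting before deduplicating means the same set of features always yields the same name, regardless of the order in which they were written in the attribute.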

Flang

By Kiran Chandramohan

OpenMP

OpenMP reduction was improved by adding support for array, complex and pointer reductions. This also involved generalizing the higher-level passes in MLIR so that they can be run in OpenMP regions as well. Support for the reduction clause in the sections construct was also added. Several fixes were made to improve the functionality and stability of OpenMP features such as privatization and support for atomic directives. Several semantic checks were added to ensure better conformance with the OpenMP standard. A temporary sequential lowering was added for the workshare directive. Support was added for the primary option in the proc_bind clause.

Directives and Intrinsics

Support for non-standard directives was extended to accept directives inside type definitions. The vector always directive (!DIR$ VECTOR ALWAYS) is now supported.

Several commonly used non-standard intrinsics are now supported. These include SECOND, ACCESS, SIGNAL, SLEEP, DERF, and GETENV.

Command Line options

New command line options were added to:

Ignore warnings (-w), use config files to invoke the compiler with non-default options (--config=)
Generate calls to out of line atomics (-moutline-atomics)
Use different OpenMP runtimes (-fopenmp=)
Use different runtime libraries, such as libgcc_s or compiler-rt (-rtlib)
Change the code model (-mcmodel=)

The handling of the -L option on Windows for user libraries was modified to prefer them over the ones from the build or install directory of LLVM.

Code generation

A pass was added to convert constant arguments to global variables. This helps the LLVM optimizer to specialize called functions with constants as arguments. This optimization benefits benchmarks like 503.bwaves_r in SPEC2017. For portability reasons this pass is switched off by default. The pass can be switched ON with the --enable-constant-argument-globalisation option.
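One way to picture this transformation is to model Fortran's by-reference argument passing in C. The sketch below is our reading of the idea rather than the pass's actual output, and all names in it are hypothetical.

```c
/* Callee written the way a Fortran procedure receives its arguments:
 * by reference. */
static long sum_of_squares(const int *n) {
    long total = 0;
    for (int i = 1; i <= *n; ++i) {
        total += (long)i * i;
    }
    return total;
}

/* Before: the constant is stored into a stack temporary and the address
 * of that temporary is passed to the callee. */
long call_with_stack_temporary(void) {
    int tmp = 10;
    return sum_of_squares(&tmp);
}

/* After: the constant lives in a read-only global, so the optimizer can
 * see that the pointed-to value never changes. */
static const int const_ten = 10;

long call_with_global_constant(void) {
    return sum_of_squares(&const_ten);
}
```

Both callers compute the same result. The difference is that the second form makes it easier for the optimizer to prove the argument is a known constant and to specialize the callee for it.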

Support was added for Complex16 type parameters and returns on the AArch64 platform.

The main function is now generated during compilation, and only when the Fortran program statement is present. Previously, it was linked in from a library.

By Volodymyr Turanskyy

Re-use is only permitted for informational and non-commercial or personal use only.