Arm Performance Libraries Release History

This page describes the changes between releases of Arm Performance Libraries (standalone version).

To download and install the latest version of Arm Performance Libraries, see our downloads page.

Arm Performance Libraries is also available as part of the Arm Compiler for Linux product. For more information, see the Arm Compiler for Linux page.

Download Free Arm Performance Libraries (Free ArmPL)

Version 24.04

Released: April 04, 2024

  • Download Free Arm Performance Libraries (Free ArmPL): 24.04 April 04, 2024

    What's new in 24.04

    Arm Performance Libraries 24.04 covers the following releases:

    • Arm Performance Libraries 24.04 - Released 4th April 2024

    Release summary

    Arm Performance Libraries 24.04 is compatible with GCC versions 7 to 13.

    Arm Performance Libraries

    Additions and changes:

    Describes new features or components added, or any technical changes to features or components, in the 24.04.0 release.

     

    - Arm Performance Libraries 24.04.0 includes the interface to the random number generation part of the VSL library developed by Intel(R) and shipped for x86 processors as part of oneMKL.  We are grateful to Intel(R) for having released this interface, along with their documentation, to us under a Creative Commons 4.0 licence, allowing us to develop our own implementation of this functionality for users of Arm-based systems, enabling software portability between architectures.

      We have endeavoured to ensure that the same generators and initializations are used as documented in the oneMKL documentation. This means that functions that return bit sequences are bitwise reproducible between Arm and x86 systems. If an integer or floating-point result is requested, the results may differ because the precision of various operations differs between the two libraries.

      Note that this release does not include all of the random number functions from VSL. The missing functions are listed in the documentation as not currently implemented. We intend to extend this coverage in future releases, and we are very keen to hear from users who find missing functionality that they would like us to prioritize.

    - Arm PL for Linux now supports performance tunings for an extended list of microarchitectures and SoCs, including:
        - Neoverse V2 (NVIDIA Grace and AWS Graviton4).
        - Neoverse N2 (Alibaba Yitian 710 and Microsoft Cobalt-100).
        - Neoverse V1 (AWS Graviton3).
        - Neoverse N1 (AWS Graviton2 and Ampere Altra/Altra Max).
        - AmpereOne.
        - Fujitsu A64FX.

    - Increased performance for:
        - FFT functions, especially Hermitian (c2r/r2c) transforms.
        - Small LAPACK functions when called with many threads.

    - When downloading the standalone Linux version of Arm PL there are now just
      four links to select from:
        - .rpm and .deb based installers for GCC users.
        - .rpm and .deb based installers for NVHPC users.

    - The GCC compatible releases are built with GCC 13 and tested with GCC
        versions 7 to 13.

    - The NVHPC compatible releases are built and tested with NVHPC 24.1.
        - Note that this NVHPC release uses a different ABI from previous NVHPC releases for returning complex types from Fortran functions, and is not backwards compatible.

    - The version of Arm PL released as part of ACfL maintains the same set of installers for supported Linux distributions as in previous releases.

    - The Windows version of Arm PL now uses a Windows Installer to guide the user through configuration.

    - Performance improvements in libamath for:
        - sinpi, sinpif, cospi, cospif, atanh and atanhf.

    - The Windows version of Arm PL now includes libamath for the first time.
        - This includes scalar and Neon math.h functions, with Neon functions using the vector ABI described here:
          https://community.arm.com/arm-community-blogs/b/high-performance-computing-blog/posts/using-vector-math-functions-on-arm

    Resolved issues:

    There are no resolved issues to report in the 24.04.0 release.

    Open technical issues:

    There are no open technical issues in the 24.04.0 release.

    • Release Note
    • EULA
  • Download Free Arm Performance Libraries (Free ArmPL): 23.10 October 12, 2023

    What's new in 23.10

    Arm Performance Libraries 23.10 covers the following releases:

    • Arm Performance Libraries 23.10 - Released 12th October 2023

    Release summary

    Arm Performance Libraries 23.10 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2
    • GCC 11.3
    • GCC 12.2

    Arm Performance Libraries

    Additions and changes:

    Describes new features or components added, or any technical changes to
    features or components, in the 23.10.0 release.

    • This is the first combined release of Arm Performance Libraries for Linux, macOS and Windows.
    • New build of the library compatible with NVIDIA HPC Compilers 23.3.
    • The Arm Neoverse V2 and Neoverse N2 cores are added as new microarchitecture targets with specific tunings.
    • Increased performance for:
      • Many cases across BLAS routines.
      • Parallel FFT implementations.
    • New BLAS extension matrix-copy routines added:
      • Out-of-place routines: ?OMATCOPY.
      • In-place routines: ?IMATCOPY
      • See examples and online documentation for details.
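
    As a rough illustration of what an out-of-place matrix-copy routine in the ?OMATCOPY family computes, the sketch below forms B := alpha * op(A) for a column-major double-precision matrix, where op is either the identity or the transpose. The function name ref_domatcopy and its argument list are hypothetical and chosen only for this example. They are not the Arm Performance Libraries interface, which is described in the online documentation.

```c
#include <stddef.h>

/* Reference sketch only: hypothetical name and signature, not the ArmPL API.
 * Computes B := alpha * op(A) for column-major storage.
 *   trans      : 'T' writes the transpose of A, anything else a plain copy
 *   rows, cols : dimensions of A
 *   lda, ldb   : leading dimensions (column strides) of A and B
 * B is rows x cols for a plain copy and cols x rows for a transpose. */
void ref_domatcopy(char trans, size_t rows, size_t cols, double alpha,
                   const double *a, size_t lda, double *b, size_t ldb)
{
    for (size_t j = 0; j < cols; ++j) {
        for (size_t i = 0; i < rows; ++i) {
            double v = alpha * a[i + j * lda];
            if (trans == 'T')
                b[j + i * ldb] = v;  /* element (i,j) of A lands at (j,i) of B */
            else
                b[i + j * ldb] = v;  /* scaled copy, same position */
        }
    }
}
```

    An in-place routine would apply the same operation but overwrite A, which for a transpose of a non-square matrix also changes the leading dimension.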

    Resolved issues:

    Describes any technical issues that are resolved in the 23.10.0 release.

    • Arm Performance Libraries now detects software availability of SVE on SVE capable hardware.

     

    Describes any technical issues that are resolved in the 23.04.1 release.

    • Integer overflow fixed in armpl_spmat_export* functions in lp64 libraries.

    Open technical issues:

    There are no open technical issues in the 23.10 release.

    • Release Note
    • EULA
  • Download Free Arm Performance Libraries (Free ArmPL): 23.04.1 May 19, 2023

    What's new in 23.04.1

    Arm Performance Libraries 23.04.1 covers the following releases:

    • Arm Performance Libraries 23.04.1 - Released 19th May 2023
    • Arm Performance Libraries 23.04.0 - Released 24th March 2023

    Release summary

    Arm Performance Libraries 23.04.1 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2
    • GCC 11.3
    • GCC 12.2

    Arm Performance Libraries

    Additions and changes:

    There are no new features or components added, or any technical changes to features or components, in the 23.04.1 release.

    This section describes the new features or components added, or any significant technical changes to features or components, in the 23.04.0 release.

    • New routines for sparse linear algebra, including parallel optimizations:
      • Sparse matrix functionality:
        • Triangular matrix solve: armpl_spsv_exec_*
      • Introduction of a new sparse vector type, armpl_spvect_t. Routines for operations on sparse vectors:
        • Dot product: armpl_spdot*_exec_*
        • AXPBY: armpl_spaxpby_exec_*, armpl_spwaxpby_exec_*
        • Plane rotation: armpl_sprot_exec_*
        • Utilities: armpl_spvec_gather_exec_*, armpl_spvec_scatter_exec_*
      • See examples and online documentation for details.
    • Support for LAPACK version 3.11.0.
    • Increased performance for:
      • Small ?GEMM problems.
      • Large parallel thread counts for all BLAS routines across microarchitectures.
      • FFT functions.

    Resolved issues:

    This section describes any technical issues resolved in the 23.04.1 release.

    • Integer overflow fixed in armpl_spmat_export* functions in lp64 libraries.

    This section describes any technical issues resolved in the 23.04.0 release.

    • pkgconfig files renamed and relocated.

    Open technical issues:

    There are no open technical issues in the 23.04.1 release.

    There are no open technical issues in the 23.04.0 release.

    • Release Note
    • EULA
  • Download Free Arm Performance Libraries (Free ArmPL): 23.04 March 24, 2023

    What's new in 23.04

    Arm Performance Libraries 23.04 covers the following releases:

    •  Arm Performance Libraries 23.04.0 - Released 24th March 2023

    Release summary

    Arm Performance Libraries 23.04.0 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2
    • GCC 11.2
    • GCC 12.2

    Arm Performance Libraries

    Additions and changes:

    This section describes the new features or components added, or any significant technical changes to features or components, in the 23.04.0 release.

    • New routines for sparse linear algebra, including parallel optimizations:
      • Sparse matrix functionality:
        • Triangular matrix solve: armpl_spsv_exec_*
      • Introduction of a new sparse vector type, armpl_spvect_t. Routines for operations on sparse vectors:
        • Dot product: armpl_spdot*_exec_*
        • AXPBY: armpl_spaxpby_exec_*, armpl_spwaxpby_exec_*
        • Plane rotation: armpl_sprot_exec_*
        • Utilities: armpl_spvec_gather_exec_*, armpl_spvec_scatter_exec_*
      • See examples and online documentation for details.
    • Support for LAPACK version 3.11.0.
    • Increased performance for:
      • Small ?GEMM problems.
      • Large parallel thread counts for all BLAS routines across microarchitectures.
      • FFT functions.
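
    To make the sparse vector operations listed above more concrete, the sketch below shows the arithmetic behind a gather, a scatter, and a dot product between a sparse and a dense vector. It assumes a sparse vector stored as parallel arrays of indices and values. The ref_* names and signatures are hypothetical, and the library's own routines work through its sparse vector type rather than raw arrays, so treat this as a description of the mathematics only.

```c
#include <stddef.h>

/* Reference sketch with hypothetical names; not the ArmPL interface.
 * A sparse vector is represented here as nnz (index, value) pairs. */

/* Gather: pick the entries of dense y at the stored indices. */
void ref_gather(size_t nnz, const size_t *idx, const double *y, double *vals)
{
    for (size_t k = 0; k < nnz; ++k)
        vals[k] = y[idx[k]];
}

/* Scatter: write the stored values into dense y at the stored indices. */
void ref_scatter(size_t nnz, const size_t *idx, const double *vals, double *y)
{
    for (size_t k = 0; k < nnz; ++k)
        y[idx[k]] = vals[k];
}

/* Dot product of sparse x with dense y: only stored entries contribute. */
double ref_spdot(size_t nnz, const size_t *idx, const double *vals,
                 const double *y)
{
    double sum = 0.0;
    for (size_t k = 0; k < nnz; ++k)
        sum += vals[k] * y[idx[k]];
    return sum;
}
```

    The cost of each operation scales with the number of stored entries rather than with the length of the dense vector, which is the motivation for a dedicated sparse vector type.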

    Resolved issues:

    This section describes any technical issues resolved in the 23.04.0 release.

    • pkgconfig files renamed and relocated.

    Open technical issues:

    There are no open technical issues in the 23.04.0 release.

     

    • Release Note
    • EULA
    • Documentation
  • Download Free Arm Performance Libraries (Free ArmPL): 22.1 September 23, 2022

    What's new in 22.1

    Arm Performance Libraries 22.1 covers the following releases:

    •  Arm Performance Libraries 22.1.0 - Released 23rd September 2022

    Release summary

    Arm Performance Libraries 22.1.0 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2
    • GCC 11.2

    Arm Performance Libraries

    Additions and changes:

    This section describes the new features or components added, or any significant technical changes to features or components, in the 22.1 release.

    • Arm Compiler for Linux suite 22.1
      • No changes

    • Arm C/C++/Fortran Compiler 22.1:
      • In previous releases, Arm Compiler for Linux supported custom extensions to the OpenMP pragma "declare variant". These extensions are deprecated in Arm Compiler for Linux 22.1. Support will be removed from Arm Compiler for Linux in version 23.0. As a result users won't be able to define custom vector equivalents of scalar functions. Arm Compiler for Linux 22.1 issues a warning message when such a construct is encountered. The deprecated features are:
        • The "scalable" extension to the #pragma omp declare variant,
        • Specifying context properties for the context set 'construct' in the context selector 'simd', in certain constructs. See documentation for more details: https://developer.arm.com/documentation/101458/2202/Optimize/Vector-routines-support/How-to-declare-custom-vector-routines-in-Arm-C-C---Compiler
      • Improved the vectorization of loops that include the 'omp parallel for' or 'omp parallel for simd' constructs.

     

    • Arm Performance Libraries 22.1.0:
      • Increased performance for:
        • BLAS level 1 and level 2 routines in serial and parallel.
        • BLAS DGEMM scaling for high numbers of cores.
        • FFT functions.
        • LAPACK SVD routines *GESVD, *GESDD, involving:
          • *ORGQR, *ORMQR, *UNGQR, *UNMQR, *ORGLQ, *ORMLQ, *UNGLQ, *UNMLQ, *BDSQR, *GEBRD, *GEQRF
      • Performance improvements in libamath, for:
        • asinh (scalar), asinhf (scalar & vector)
        • exp, expf (vector)
        • log10, log10f (vector)
        • log1p (scalar), log1pf (scalar & vector)
      • Support for LAPACK version 3.10.1.

    Resolved issues:

    • Arm Compiler for Linux suite 22.1:
      • The Arm Compiler for Linux installer no longer requires a python2 installation.
    • Arm C/C++/Fortran Compiler 22.1:
      • Fixed an internal compiler error in armflang when the move_alloc Fortran intrinsic procedure is called on a field value of a deeply nested derived type.
      • Fixed an internal compiler error in armclang that caused the error "fatal error: error in backend: Cannot select" when compiling code using a bfcvt instruction in an inline assembly block at -O1 and above.
      • Fixed a code generation bug affecting functions that contain SVE state and allocate variable length arrays.
      • Fixed an issue where the ACLE feature macros for the BFloat16 extension were not correctly defined on targets where the extension is supported.
      • Fixed a compilation failure when including the arm_sve.h header with the POSIX netdb.h header.
    • Arm Performance Libraries 22.1.0:
      • Bug fixes for cblas_*gemmt and cblas_*axpby (OpenMP) functions.

    Open technical issues:

    • Arm C/C++/Fortran Compiler 22.1:
      • In November 2021, two general source code vulnerabilities in compilers were disclosed - CVE-2021-42574 and CVE-2021-42694. Both exploits use Unicode characters to make source code look different when viewed in a text editor to what is processed by the compiler. This exploit could be used by a malicious programmer to inject malicious code into software that looks non-malicious when reviewed or inspected by programmers responsible for the integrity of said software. Arm Compiler for Linux does not have any mitigation for source code containing these attacks. Arm recommends using static analysis tools to detect such vulnerabilities in source code before compilation. For more information see https://developer.arm.com/documentation/ka005002




    • Release Note
    • EULA
    • Documentation
  • Download Free Arm Performance Libraries (Free ArmPL): 22.0.2 May 25, 2022

    What's new in 22.0.2

    Arm Performance Libraries 22.0 covers the following releases:

    • Arm Performance Libraries 22.0.2 - Released 25th May 2022
    • Arm Performance Libraries 22.0.1 - Released 1st April 2022

    Release summary

    Arm Performance Libraries 22.0.2 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2
    • GCC 11.2

    Arm Performance Libraries

    Additions and changes:

    • 22.0.0
      • The freely distributed version of Arm Performance Libraries is now optimized for more microarchitectures, including:

        • Neoverse N1 (such as Amazon Graviton 2 and Ampere Altra)
        • Neoverse V1
        • Neoverse N2
        • Fujitsu A64FX
        • Marvell ThunderX2

      • Improved the performance for:

        • BLAS level 1 routines: SVE optimizations for ?COPY, ?SCAL, ?AXPY
        • BLAS level 2 routines: packed and banded functionality; ?TRMV and ?TRSV for large problems
        • BLAS level 3 routines: ?TRMM and ?TRSM for large problems
        • LAPACK routines: ?EEVD (eigenvalue decomposition) for small problems; ?POTRF for multithreaded cases

      • Added support for I?AMIN BLAS extension routines for all types, finding the location of the first minimum value in a vector.

      • Added support for LAPACK version 3.10.0. In addition, an out of bounds bug in LAPACK ?LARRV routines (CVE-2021-4048) has been patched.

      • Performance improvements in libamath, for:

        • atan, atanf (vector)
        • atan2, atan2f (scalar & vector)
        • cos, cosf (vector)
        • erfc, erfcf (vector)
        • exp, expf (vector)
        • logf (vector)
        • pow (vector)
        • sin, sinf (vector)
        • tanf (vector)

      • When using Arm Performance Libraries built for GCC, C/C++ users do not need to link to libgfortran.
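
    The I?AMIN routines mentioned above are the counterpart of the standard I?AMAX routines. The sketch below is a plain C reference for the double-precision real case. It assumes, by analogy with I?AMAX, that magnitudes (absolute values) are compared and that the first occurrence wins on ties. The name ref_idamin, the zero-based return value, and the unsigned stride are choices made for this example, not the library's interface.

```c
#include <stddef.h>

/* Reference sketch only: hypothetical name, not the ArmPL API.
 * Returns the zero-based position of the first element with the smallest
 * absolute value among x[0], x[incx], ..., x[(n-1)*incx].
 * Assumes n >= 1 and incx >= 1. */
size_t ref_idamin(size_t n, const double *x, size_t incx)
{
    size_t best = 0;
    double best_abs = x[0] < 0.0 ? -x[0] : x[0];
    for (size_t i = 1; i < n; ++i) {
        double v = x[i * incx];
        double a = v < 0.0 ? -v : v;
        if (a < best_abs) {  /* strict '<' keeps the first minimum on ties */
            best_abs = a;
            best = i;
        }
    }
    return best;
}
```

    Fortran BLAS index routines conventionally return one-based positions, so a caller comparing against a Fortran interface would add one to this result.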

    Resolved issues:

    • 22.0.2
      • There are no resolved issues to report in the 22.0.2 release.

    Open technical issues:

    • 22.0.2
      • There are no open technical issues in the 22.0.2 release.




    • Release Note
    • EULA
    • Documentation
  • Download Free Arm Performance Libraries (Free ArmPL): 21.1.0 August 24, 2021

    What's new in 21.1.0

    Arm Performance Libraries 21.1 covers the following releases:

    • Arm Performance Libraries 21.1.0 - Released 24th August 2021

    Release summary

    Arm Performance Libraries 21.1.0 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2

    Arm Performance Libraries

    Additions and changes:

    • 21.1.0
      • Improved the performance for:

        • BLAS level 1 and 2 routines: multithreaded performance improvements
        • BLAS level 2 routines: ?GEMV
        • BLAS level 3 routines: ?SYRK, ?SYR2K, ?HERK, ?HER2K; and HGEMM for Neoverse N1
        • Interleave-batch functions: armpl_dgemm_interleave_batch, armpl_dtrmm_interleave_batch, and armpl_dtrsm_interleave_batch

      • Added support for LAPACK version 3.9.1.

      • Added support for symmetric band matrix-vector multiplication BLAS routines: CSBMV and ZSBMV to armpl.h.
        Documentation is available at https://developer.arm.com/documentation/101004/2110/BLAS-Basic-Linear-Algebra-Subprograms/BLAS-level-2.

      • Improved the performance of atan2 in libamath.

    Resolved issues:

    • 21.1.0
      • There are no resolved issues to report in the 21.1.0 release.

    Open technical issues:

    • 21.1.0
      • There are no open technical issues in the 21.1.0 release.




    • Release Note
    • EULA
    • Documentation
  • Download Free Arm Performance Libraries (Free ArmPL): 21.0.0 March 30, 2021

    What's new in 21.0.0

    Arm Performance Libraries 21.0 covers the following releases:

    • Arm Performance Libraries 21.0.0 - Released 30th March 2021

    Release summary

    Arm Performance Libraries 21.0 is available for the following versions of GCC:

    • GCC 7.5
    • GCC 8.2
    • GCC 9.3
    • GCC 10.2

    Arm Performance Libraries

    Additions and changes:

    • 21.0.0
      • Arm Performance Libraries now supports all real-to-real transform functions defined in the FFTW3 interface. Previously, the planner functions for these types of transforms returned NULL, indicating that they were unavailable.

      • Added support for ?GEMMT BLAS extension routines for all types, performing matrix-matrix multiplication and updating the lower or upper triangular part of C only.

      • Added a new suite of routines that are optimized for large batches of small linear algebra problems. Interfaces for the following real and double precision problems are provided:

        • BLAS: DDOT, DGER, DGEMM, DGEMV, DSCAL, DTRMM, DTRSM, and DTRSV
        • LAPACK: DGEQR (QR), DGETRF (LU), and DPOTRF (Cholesky)
        • DORMQR and DORGQR (for multiplying and generating Q)
        • Utility routines for packing and unpacking matrices to and from the new batched data layout.

        An example and full documentation are provided.

      • Improved performance for:

        • BLAS level 1 routines: ?IAMAX, ?NRM2, ?ASUM, and ?DOT
        • BLAS level 2 routines: ?HBMV, ?SBMV, ?TBMV, ?SYR, and ?SYR2
        • BLAS level 3 routines: ?TRSM and [SD]GEMM
        • LAPACK routines: ?POTRF and ?GETRF (for small problems)
        • General small problems

      • Vector performance improvements in libamath:

        • Neon functions: atan, atanf, erf, erff, exp2, exp2f, exp10, and exp10f
        • SVE functions: atan, erff, cos, cosf, pow, sin, sincos, and sincosf
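
    The ?GEMMT routines described above differ from ?GEMM only in which entries of C are written. The following sketch illustrates that for the double-precision case without transposes, computing C := alpha*A*B + beta*C on one triangle of a square, column-major C and leaving the other triangle untouched. The name ref_dgemmt and its reduced argument list are hypothetical and are not the library's interface.

```c
#include <stddef.h>

/* Reference sketch only: hypothetical name and signature, not the ArmPL API.
 * Updates one triangle of the n x n matrix C with alpha*A*B + beta*C.
 *   uplo : 'U' updates the upper triangle, anything else the lower triangle
 *   A is n x k, B is k x n, all matrices column-major. */
void ref_dgemmt(char uplo, size_t n, size_t k, double alpha,
                const double *a, size_t lda, const double *b, size_t ldb,
                double beta, double *c, size_t ldc)
{
    for (size_t j = 0; j < n; ++j) {
        /* Row range of column j that lies in the requested triangle. */
        size_t i_begin = (uplo == 'U') ? 0 : j;
        size_t i_end   = (uplo == 'U') ? j + 1 : n;
        for (size_t i = i_begin; i < i_end; ++i) {
            double sum = 0.0;
            for (size_t p = 0; p < k; ++p)
                sum += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}
```

    When the product is known to be symmetric, updating a single triangle does roughly half the arithmetic of a full matrix-matrix multiplication.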

    Resolved issues:

    • 21.0.0
      • If you attempt to install the library using the `--install-to <installation-location>` option, the installer might generate a warning of the form: 'Installing...find: '<installation-location>/arm-*-compiler-<version>*/lib/clang/*/armpl_links': No such file or directory'. This warning is erroneous and does not impact the installation or function of the library.

      • The prototypes for the following LAPACKE functions were previously missing:

        • lapacke_?geqpf
        • lapacke_?geqpf_work
        • lapacke_?ggsvd
        • lapacke_?ggsvd_work
        • lapacke_?ggsvp
        • lapacke_?ggsvp_work

        The prototypes are now available in armpl.h.

    Open technical issues:

    • 21.0.0
      • There are no open technical issues at the time of this release.






    • Release Note
    • EULA
    • Documentation
  • Download Free Arm Performance Libraries (Free ArmPL): 20.3.0 September 09, 2020

    What's new in 20.3.0

    Arm Performance Libraries 20.3 covers the following releases:

    • Arm Performance Libraries 20.3.0 - Released 9th September, 2020

    Release summary

    Arm Performance Libraries 20.3 is available for the following versions of GCC:

    • GCC 7.1
    • GCC 8.2
    • GCC 9.3

    Arm Performance Libraries

    Additions and changes:

    • 20.3.0
      • In addition to being included in the Arm Compiler for Linux package, the Arm Performance Libraries Reference Guide is now available in HTML format on the Arm Developer website:

        https://developer.arm.com/documentation/101004/latest/
      • Added new BLAS level 2 extension routines, ?GERB. Use the new routines to perform a generalized outer-product with an additional scaling parameter.
        For more information, see the online documentation:

        https://developer.arm.com/documentation/101004/2030/BLAS-Basic-Linear-Algebra-Subprograms/BLAS-level-2

        In this release, there is also improved performance for:

        • BLAS level 1 routines: ?NRM2 and ?ASUM
        • BLAS level 2 routines: ?GER, ?SYR, ?HER, ?SYR2, ?HER2, and ?GBMV
        • LAPACK routine: DGEEV (for small eigenvalue problems)

      • Improved single precision FFT performance.

      • Improved libamath performance for:

        • atan and atanf, in both scalar and Neon vector forms
        • SVE erf
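
    As an illustration of the generalized outer product that the ?GERB routines above provide, the sketch below assumes that the additional scaling parameter is a factor beta applied to the existing matrix, so that A := alpha*x*y^T + beta*A. This reading, the name ref_dgerb, and the unit-stride arguments are assumptions made for this example. The linked documentation is the authoritative description.

```c
#include <stddef.h>

/* Reference sketch only: hypothetical name and signature, not the ArmPL API.
 * Computes A := alpha * x * y^T + beta * A for an m x n column-major A,
 * with x of length m and y of length n (both unit stride). */
void ref_dgerb(size_t m, size_t n, double alpha, const double *x,
               const double *y, double beta, double *a, size_t lda)
{
    for (size_t j = 0; j < n; ++j) {
        for (size_t i = 0; i < m; ++i) {
            /* Rank-1 contribution plus the scaled existing entry. */
            a[i + j * lda] = alpha * x[i] * y[j] + beta * a[i + j * lda];
        }
    }
}
```

    With beta equal to 1 this reduces to the standard ?GER rank-1 update.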

    Resolved issues:

    • 20.3.0
      • Fixed a performance degradation in ?SYMV routines that was introduced in 20.2.0.

    Open technical issues:

    • 20.3.0
      • If you attempt to install the library using the `--install-to <installation-location>` option, the installer might generate a warning of the form: 'Installing...find: '<installation-location>/arm-*-compiler-20.3*/lib/clang/*/armpl_links': No such file or directory'. This warning is erroneous and does not impact the installation or function of the library.

      • The uninstall.sh script does not correctly uninstall a library that has been installed to a custom location. Instead, you will need to manually remove it from your filesystem. This limitation applies to non-root installations on rpm-based systems, and any relocated installations on Debian-based systems.


    • Release Note
    • EULA
    • Documentation
  • Download Free Arm Performance Libraries (Free ArmPL): 20.2 - latest update 20.2.1 August 07, 2020

    What's new in 20.2 - latest update 20.2.1

    Arm Performance Libraries 20.2 covers the following releases:

    • Arm Performance Libraries 20.2.1 - Released 7th August, 2020
    • Arm Performance Libraries 20.2 - Released 26th June, 2020

    Release summary

    Arm Performance Libraries 20.2 and 20.2.1 are available for the following versions of GCC:

    • GCC 7.1
    • GCC 8.2
    • GCC 9.3

    Arm Performance Libraries

    Additions and changes:

    • 20.2.1
      • The 20.2.1 release is compatible with all Armv8.0-A cores and later.

    • 20.2
      • Improved BLAS level 2 performance for symmetric matrices.

      • Implemented improvements to FFT performance, including faster planning.

      • Implemented improvements to the SVE versions of libamath functions, namely
        exp, expf, log, logf, sin, sinf, cos, and cosf.

    Resolved issues:

    • 20.2.1
      • There are no resolved issues to report in the 20.2.1 release.

    • 20.2
      • Fixed a bug in the LAPACK *POTRF routines that would cause a crash when
        using multiple threads, and when operating on large matrices.

    Open technical issues:

    There are no issues known at the time of this release.

    • Release Note
    • EULA
    • Documentation