Release history

This page lists the Arm RAN Acceleration Library release history.

To download and install the latest version of Arm RAN Acceleration Library, see our Downloads page and follow the installation steps given in the Reference Manual.

Details on release versions and links to the Release Notes, Documentation, and End User License Agreement (EULA) are provided below.

Arm RAN Acceleration Library

Version 21.10

Released: October 08, 2021

  • Arm RAN Acceleration Library: 21.10 October 08, 2021

    What's new in 21.10

    New features and enhancements:

    • To build Arm RAN Acceleration Library, you must now have CMake version 3.3.0 or higher installed.

    • You can now build a variant of Arm RAN Acceleration Library that is optimized for Scalable Vector Extension (SVE) code. SVE is now a supported argument for the -DARMRAL_ARCH= CMake option which enables optimizations for the AArch64-specific SVE architecture features. The SVE implementation does not require SVE2 features to be available on your target system. You can continue to use -DARMRAL_ARCH=SVE2 to enable all SVE and SVE2 optimizations.

    • Added SVE2 implementations of:

      • 2x2 and 4x4 complex floating-point matrix multiplication functions: armral_cmplx_mat_mult_2x2_f32, armral_cmplx_mat_mult_4x4_f32, and armral_cmplx_mat_mult_f32

      • Matrix inversion

      • 8-bit Mu-Law compression and decompression routines

      • 12-bit block-float compression and decompression routines

      • 14-bit block-float compression and decompression routines

      • Modulation and demodulation routines

    • Added Neon implementations of the following compression and decompression routines:

      • 9-bit Mu-Law compression and decompression

      • 14-bit Mu-Law compression and decompression

      • 8-bit block-scaling compression and decompression

      • 9-bit block-scaling compression and decompression

      • 14-bit block-scaling compression and decompression

    • Added two new functions for polar coding subchannel allocation: armral_polar_subchannel_interleave and its inverse, armral_polar_subchannel_deinterleave. The functions operate as specified in 3GPP Technical Specification (TS) 38.212.

    • Added implementations of the 6-bit, 11-bit, and 16-bit Cyclic Redundancy Check (CRC) polynomials described in section 5.1 of 3GPP Technical Specification (TS) 38.212.
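
      As a rough illustration of what such a check computes, the sketch below runs a bit-serial CRC with the three generator polynomials that TS 38.212 section 5.1 defines for these lengths. It does not use or reproduce the library's API: the name crc_bitwise and the macro names are hypothetical, and the zero initial register, MSB-first bit order, and whole-byte input are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Generator polynomials from TS 38.212 section 5.1, written without the
   leading term: D^6+D^5+1, D^11+D^10+D^9+D^5+1 and D^16+D^12+D^5+1. */
#define CRC6_POLY  0x21u
#define CRC11_POLY 0x621u
#define CRC16_POLY 0x1021u

/* Hypothetical helper (not a library function): bit-serial CRC over whole
   bytes, most significant bit first, with a zero initial register and no
   final XOR. These conventions are assumptions for this example. */
static uint32_t crc_bitwise(const uint8_t *data, size_t len, unsigned width,
                            uint32_t poly) {
  assert(width >= 1 && width <= 31);
  const uint32_t top = (uint32_t)1 << (width - 1);
  const uint32_t mask = (top << 1) - 1u;
  uint32_t reg = 0;
  for (size_t i = 0; i < len; ++i) {
    for (int bit = 7; bit >= 0; --bit) {
      const uint32_t in = (data[i] >> bit) & 1u;
      /* Feedback is the bit about to be shifted out, XORed with the input. */
      const uint32_t feedback = (uint32_t)((reg & top) != 0u) ^ in;
      reg = (reg << 1) & mask;
      if (feedback) {
        reg ^= poly;
      }
    }
  }
  return reg;
}
```

      Under these assumptions, the nine ASCII bytes "123456789" give 0x31C3 with the 16-bit polynomial, which is the widely published check value for this polynomial with a zero initial value.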

    • Added an implementation of Low-Density Parity Check (LDPC) encoding for:

      • A single block of data

      • A single code block

      The implementation uses a layered min-sum algorithm and is not yet a full implementation: performing Hybrid Automatic Repeat Request (HARQ), as required to decode rate-matched data, is not yet supported.

    • You can now run the tests and benchmarking under semihosting, using -DARMRAL_SEMIHOSTING=On. -DARMRAL_SEMIHOSTING is described in more detail in the file.

    • Added more examples that show how to use Arm RAN Acceleration Library:

      • An example that uses the Polar coding API, polar_example.cpp.

      • An example that uses the Modulation and Demodulation API, modulation_example.c.

      The examples can be found in the examples/ directory, and are described in the file.

    • Improved the performance of the following implementations:

      • 14-bit block-float compression and decompression routines in Neon

      • 16QAM and 64QAM demodulation routines in Neon

      • Gold Sequence generation (armral_seq_generator) in Neon

      • Vector-vector multiplication (armral_cmplx_vecmul_f32_2) in SVE

      • Vector-vector dot product (armral_cmplx_vecdot_f32) in SVE

    • The block-float compression routines now calculate the exponent exactly according to O-RAN.WG4.CUS.0-v05.00 Technical Specification (TS) section A.1.1. Previously, the exponent was calculated based on the absolute value. In cases where the number of redundant sign bits differed between the largest negative value and its absolute value, the calculation could lead to one less significant bit being stored in the compressed representation.
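
      The difference is easiest to see for a single extreme value. The sketch below is an illustration only, not the library's code: the helper names are hypothetical, and each function handles one 16-bit value rather than a whole resource block. It compares the shift needed to fit the value itself into a mantissa with the shift needed to fit its absolute value.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest two's complement width (sign bit included) that can hold x. */
static unsigned signed_bits_needed(int16_t x) {
  /* For negative x, ~x is non-negative and needs the same number of bits. */
  uint16_t y = (uint16_t)(x < 0 ? ~x : x);
  unsigned bits = 1; /* the sign bit */
  while (y != 0) {
    ++bits;
    y >>= 1;
  }
  return bits;
}

/* Shift needed so that x itself fits in mantissa_bits bits. */
static unsigned exponent_exact(int16_t x, unsigned mantissa_bits) {
  assert(mantissa_bits >= 1 && mantissa_bits <= 16);
  const unsigned need = signed_bits_needed(x);
  return need > mantissa_bits ? need - mantissa_bits : 0;
}

/* The same calculation on the magnitude of x, as a stand-in for an
   absolute-value-based approach. */
static unsigned exponent_from_abs(int16_t x, unsigned mantissa_bits) {
  assert(mantissa_bits >= 1 && mantissa_bits <= 16);
  /* Widen before negating: -32768 has no int16_t magnitude. */
  int32_t mag = x < 0 ? -(int32_t)x : (int32_t)x;
  unsigned need = 1; /* the sign bit */
  while (mag != 0) {
    ++need;
    mag >>= 1;
  }
  return need > mantissa_bits ? need - mantissa_bits : 0;
}
```

      For -256 and a 9-bit mantissa, exponent_exact gives 0 because -256 fits in nine bits of two's complement, while exponent_from_abs gives 1 because +256 needs ten bits. The larger exponent discards one low-order bit of every sample that shares it.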

    • Updated the modulation API to take 8-bit unsigned integers as input (previously 8-bit signed integers), for compatibility with other routines.

    • Simplified the armral_polar_decoder function. The function now takes the frozen bits mask explicitly, rather than computing the frozen bits mask inside the implementation. To calculate the frozen bits mask, you must first call the function armral_polar_frozen_mask.

    • The armral_polar_frozen_mask function now takes the encoded message size E and the number of parity bits n_pc, in addition to polar code size N and information bits K. Bits that are not sent because of rate-matching (cases where E < N) are now set to be frozen to comply with section 5.4.1 of 3GPP Technical Specification (TS) 38.212.

    • The polar coding functions (armral_polar_encoder and armral_polar_decoder) now take pointers to uint8_t rather than pointers to uint32_t, for compatibility with other library functions like modulation and demodulation.

      The bit order of polar-encoded values has been adjusted to match the above interface change and to match other library functions.

    • Updated the demodulation API to output negated Log-Likelihood Ratios (LLRs) to comply with the correct definition of routines.

    • Modulation now returns an error when the last symbol would contain incomplete data. Previously, data might silently be lost.
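
      A caller can guard against this error by checking the input length before calling modulation. The helper below is a hypothetical sketch, not part of the library, and assumes the usual bits-per-symbol counts of 2 (QPSK), 4 (16QAM), 6 (64QAM), and 8 (256QAM).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when nbits fills a whole number of symbols,
   so that the last symbol is not left with incomplete data. */
static bool fills_whole_symbols(size_t nbits, unsigned bits_per_symbol) {
  assert(bits_per_symbol > 0);
  return nbits % bits_per_symbol == 0;
}
```

      For example, 24 bits fill four 64QAM symbols exactly, whereas 16 bits would leave the third 64QAM symbol two bits short.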

    • You can now build Arm RAN Acceleration Library in directories that have spaces or non-Latin characters in the directory name.

    Resolved issues:

    There are no resolved issues to report in this release.

    Open issues:

    There are no open technical issues at the time of this release.

    • Release Note
    • EULA
    • Documentation
  • Arm RAN Acceleration Library: 21.07 July 09, 2021

    What's new in 21.07

    New features and enhancements:

    • Arm RAN Acceleration Library is now also tested with GCC 11.1.0, in addition to versions 7.5.0, 8.2.0, 9.3.0, and 10.2.0.

    • Arm RAN Acceleration Library is now tested with Clang 12.0.1, instead of version 9.

    • Running 'make bench' now requires a recent version of Python 3 to be installed. Arm RAN Acceleration Library has been tested with Python 3.8.5.

    • The order of struct elements in armral_compressed_data_* has been changed (swapped) to match the format described in O-RAN.WG4.CUS.0-v05.00, Annex D.

    • Added an example which shows how to use the 9-bit block-float compression and decompression interface.

    • Added SVE2 implementations of:

      • 8-bit block-float compression and decompression routines

      • 9-bit block-float compression and decompression routines

      • 2x2 and 4x4 complex-float matrix multiplication routines

      • 2x2, 4x4, 8x8, and 16x16 Hermitian matrix inversion routines

      • 2x2 batched Hermitian matrix inversion

    • Improved the performance of the 12-bit block-float compression and decompression routines in Neon.

    • Updated Mu-Law to use the same interface as block-float compression and decompression:

      • Mu-Law can now produce or consume more than one resource block at a time, instead of a single block at a time. Mu-Law compression now takes an array of armral_cmplx_int16_t as input and outputs an array of armral_compressed_data_8bit. Mu-Law decompression now takes an array of armral_compressed_data_8bit as input and outputs an array of armral_cmplx_int16_t.

      • The armral_mu_law_compression function has been renamed armral_mu_law_compr_8bit to match block-float compression.

      • The armral_mu_law_expansion function has been renamed armral_mu_law_decompr_8bit to match block-float decompression.

    Resolved issues:

    There are no resolved issues to report in this release.

    Open issues:

    There are no open technical issues at the time of this release.

    • Release Note
    • EULA
    • Documentation
  • Arm RAN Acceleration Library: 21.04 April 13, 2021

    What's new in 21.04

    New features and enhancements:

    • The library is now tested with GCC 7.5.0, instead of with GCC 7.1.0.

    • Improved the performance of:

      • 9-bit block-float decompression.

      • All modulation kernels, including extra optimizations for QPSK and 256-QAM modulation.

    • Added Successive Cancellation List (SCL) decoding. The polar decoder (armral_polar_decoder) can now use Successive Cancellation List (SCL) decoding in addition to the previously-used Successive Cancellation (SC) method. A new parameter, 'l', has been added to polar decoding, which represents the list size to be used in SCL. ArmRAL supports list sizes (l) of one, two, or four. If a list size of one is specified, ArmRAL uses SC decoding. If a list size of greater than one is specified, l sequences are output. The most likely candidate codeword is output first. Users must allocate a buffer of size 'l*n' bits for the output codewords.
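
      The buffer size follows directly from the list size and the code length. A minimal sketch, assuming the output is packed eight bits per byte; the helper name scl_output_bytes is hypothetical and is not a library function:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed to hold l candidate codewords of n bits
   each, that is l*n bits, rounded up to a whole byte. */
static size_t scl_output_bytes(unsigned l, size_t n) {
  assert(l == 1 || l == 2 || l == 4); /* list sizes named in the text above */
  return ((size_t)l * n + 7) / 8;
}
```

      With a list size of four and 128-bit codewords, this gives 64 bytes.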

    • Added more benchmark cases for polar decoding.

    • Added support for three new CMake options:

      • '-DARMRAL_ARCH=<arch>'. '-DARMRAL_ARCH=<arch>' allows you to enable optimizations for the AArch64-specific architecture features: Neon ('-DARMRAL_ARCH=NEON') or SVE2 ('-DARMRAL_ARCH=SVE2'). The default is '-DARMRAL_ARCH=NEON'.

      • '-DSTATIC_TESTING=On|Off'. '-DSTATIC_TESTING=On|Off', when 'On', allows you to force the compiler to link the tests statically.

      • '-DARMRAL_TEST_RUNNER=<command>'. '-DARMRAL_TEST_RUNNER=<command>' allows you to specify a command to use as a prefix before each test executable.

      For more information about all the new CMake options, see

    • Added SVE2 optimizations for the 'arm_cmplx_vecmul' and 'arm_cmplx_vecdot' functions.

    • Added Rader's algorithm for executing FFTs of prime length for which no hand-written kernel exists. FFTs of arbitrary length can now be solved. FFTs that use Rader's algorithm are slower than FFTs that do not use Rader's algorithm.

    • Added an 'examples' directory which contains simple programs that demonstrate how to use different functions in the library. To learn how to build and run the examples, see the and files.

    • Added two new 14-bit block-float compression and decompression functions: armral_block_float_compr_14bit and armral_block_float_decompr_14bit.

    • Documentation changes:

      • Documentation builds are now tested with Doxygen 1.8.13. The documentation configuration file,, has been updated to align it with the Doxygen 1.8.13 configuration file template.

      • References to 'Modules' have been updated to instead reference 'Functions'.

      • Brief descriptions have been added for each of the function groups and data structures that Arm RAN Acceleration Library supports.

      • Added a new tutorial, 'Link to Arm RAN Acceleration Library' (see, that describes how to compile and link to Arm RAN Acceleration Library, and how to run the 'fft_cf32_example.c' example.

    Resolved issues:

    There are no resolved issues to report in this release.

    Open issues:

    There are no open technical issues at the time of this release.

    • Release Note
    • EULA
    • Documentation
  • Arm RAN Acceleration Library: 21.01 January 14, 2021

    What's new in 21.01

    New features and enhancements

    • The operation of the "type" parameter in the interface for armral_solve_* functions has changed. Instead of specifying the equalization type using a parameter, you must now specify the number of subcarriers per G matrix. To enable this, the "type" parameter in the interface for armral_solve_* functions has been replaced by "num_sc_per_g". You need to update your code so that you pass four or six to the armral_solve_* function, instead of passing one or two, respectively.

    • armral_solve_* functions now support equalization with a single subcarrier per G matrix. To enable this equalization, pass one as the value to the "num_sc_per_g" parameter in the armral_solve_* functions. Note that this type of equalization is not type-1 equalization; type-1 equalization solves four subcarriers per G matrix.
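
      When porting existing callers, the change amounts to a fixed mapping from the old value to the new one. The helper below is a hypothetical sketch of that mapping and is not part of the library:

```c
#include <assert.h>

/* Hypothetical migration helper: maps the old "type" value to the
   "num_sc_per_g" value that replaces it. Returns 0 for an unknown type. */
static unsigned num_sc_per_g_from_type(unsigned old_type) {
  switch (old_type) {
  case 1:
    return 4; /* type-1: one G matrix per four subcarriers */
  case 2:
    return 6; /* type-2: one G matrix per six subcarriers */
  default:
    return 0;
  }
}

/* Example use: code that previously passed type 2 now passes 6. */
static void example_migration(void) {
  const unsigned num_sc_per_g = num_sc_per_g_from_type(2);
  assert(num_sc_per_g == 6);
}
```

      The new single-subcarrier mode has no old "type" equivalent, so callers that want it pass one directly.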

    • Benchmarking now prints all the performance data, in addition to the median value. Previously, only the median value was printed. You can use the additional data for performing further statistical analyses.

    • Benchmarking now prints the MD5 checksum of the binary being benchmarked. The checksum can be useful for deciding if code changes are responsible for performance differences, or if the performance differences are due to noise in the benchmark itself.

    • If you attempt to configure using an old version of the GCC or Clang compilers, CMake now emits a warning. The warning is not an error. You can build with unsupported compilers, but the warning indicates that the library might not compile successfully.

    • Improved the CRC24 performance in both big-endian and little-endian modes.

    • The armral_cmplx_vecmul_i16_2 function now saturates intermediate values to operate the same as the other vector multiply functions.

    • Improved the performance of armral_cmplx_vecmul_i16_2.

    • Added the 'ARMRAL_ENABLE_COVERAGE' option to CMake. For more information about the 'ARMRAL_ENABLE_COVERAGE' option, see the README.

    • Added a new 'make uninstall' target. The new 'make uninstall' target simplifies the uninstall process; any empty directories that were previously created as part of uninstallation are now removed automatically.

    • Improved the performance of 9-bit and 12-bit block-float compression.

    • Improved the performance of 9-bit block-float decompression.

    • Improved the performance of the Pearson correlation coefficient calculation.

    • Improved the performance of equalization (armral_solve_*) functions for type-2 cases. In type-2 cases, the same equalization matrix G is used for six consecutive input vectors, in other words "num_sc_per_g=6". For more information, see the ArmRAL documentation.

    • Improved the performance of FFTs that take complex Q15 input and produce complex Q15 output.

    • Improved the performance of FFTs that take 32-bit complex float input and produce 32-bit complex float output.

    • The equalization routines (armral_solve_*) no longer accept numbers of samples that are not divisible by 12 because 12 is the size of one resource block.

    • Improved the performance of the 32-bit complex vector dot product functions: armral_cmplx_vecdot_i16_32bit and armral_cmplx_vecdot_i16_2_32bit.

    Resolved issues

    • 'CMAKE_C_FLAGS' and 'CMAKE_CXX_FLAGS' are no longer ignored when passed as environment variables to the initial CMake configure step.

    • Previously, CRC24 benchmarking crashed with an assertion failure when built with 'CMAKE_BUILD_TYPE=Debug' because the benchmarks attempted to pass an invalid length. These invalid cases have been removed.

    • Updated the Pearson correlation coefficient implementation to:

      • Use floating-point instead of fixed-point square root calculations. For large inputs, the fixed-point square root did not produce a correctly rounded result. Now, the implementation uses the floating point square root. To convert the result to the equivalent of a fixed-point calculation, the implementation rounds the result to the nearest integer.
      • Remove redundant bit-shifts which might cause inaccuracies for small coefficients.

    • Benchmarking incorrectly reported the solve_type*_2x2_* and solve_type*_1x4_* results: the solve_type*_1x4_* results were reported as the solve_type*_2x2_* results, and the solve_type*_2x2_* were reported as the solve_type*_1x4_* results. The reporting of the function results has been corrected.

    • Previously, Polar decoding modified global state as part of the operation, which could lead to errors if multiple threads attempted decoding simultaneously. The polar decoding operation no longer modifies global state. The function is now thread-safe.

    • Previously, if the number of subcarriers was not a multiple of 24, type-1 equalization routines (armral_solve_*) read off the end of the input G arrays. The issue did not affect the correctness of the operation; however, the memory accesses responsible for the reads off the end of the arrays have been fixed.

    Open issues

    • There are no open technical issues at the time of this release.
    • Release Note
    • EULA
  • Arm RAN Acceleration Library: 20.10 October 02, 2020

    What's new in 20.10

    New features and enhancements

    • 20.10 is the first release of Arm RAN Acceleration Library.

    Resolved issues

    • There are no resolved issues to report in this release.

    Open issues

    • There are no open technical issues at the time of this release.
    • Release Note
    • EULA