Release history

This page lists the Arm RAN Acceleration Library release history.

To download and install the latest version of Arm RAN Acceleration Library, see our Downloads page and follow the installation steps given in the Reference Manual.

Details on release versions and links to the Release Notes, Documentation, and End User License Agreement (EULA) are provided below.

Arm RAN Acceleration Library

Version 21.07

Released: July 09, 2021

  • Arm RAN Acceleration Library: 21.07 July 09, 2021

    What's new in 21.07

    New features and enhancements:

    • Arm RAN Acceleration Library is now also tested with GCC 11.1.0, in addition to versions 7.5.0, 8.2.0, 9.3.0, and 10.2.0.

    • Arm RAN Acceleration Library is now tested with Clang 12.0.1, instead of version 9.

    • Running 'make bench' now requires a recent version of Python 3 to be installed. Arm RAN Acceleration Library has been tested with Python 3.8.5.

    • The order of the struct elements in armral_compressed_data_* has been swapped to match the format described in O-RAN.WG4.CUS.0-v05.00, Annex D.

    • Added an example which shows how to use the 9-bit block-float compression and decompression interface.

    • Added SVE2 implementations of:

      • 8-bit block-float compression and decompression routines

      • 9-bit block-float compression and decompression routines

      • 2x2 and 4x4 complex-float matrix multiplication routines

      • 2x2, 4x4, 8x8, and 16x16 Hermitian matrix inversion routines

      • 2x2 batched Hermitian matrix inversion

    • Improved the performance of the 12-bit block-float compression and decompression routines in Neon.

    • Updated Mu-Law to use the same interface as block-float compression and decompression:

      • Mu-Law can now produce or consume more than one resource block at a time, instead of a single block at a time. Mu-Law compression now takes an array of armral_cmplx_int16_t as input and outputs an array of armral_compressed_data_8bit. Mu-Law decompression now takes an array of armral_compressed_data_8bit as input and outputs an array of armral_cmplx_int16_t.

      • The armral_mu_law_compression function has been renamed armral_mu_law_compr_8bit to match block-float compression.

      • The armral_mu_law_expansion function has been renamed armral_mu_law_decompr_8bit to match block-float decompression.
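
As a rough illustration of what the block-float routines listed above compute, the following sketch compresses one resource block (12 complex samples) to a shared exponent plus fixed-width mantissas, then expands it again. It is a generic block floating-point sketch, not the library's implementation: the types cplx_i16 and bfp_block and the functions bfp_compress and bfp_decompress are hypothetical names, and the mantissas are left unpacked in 16-bit slots rather than bit-packed as the O-RAN format requires.

```c
#include <assert.h>
#include <stdint.h>

#define SAMPLES_PER_RB 12 /* one resource block holds 12 complex samples */

/* Hypothetical types for illustration; not the library's definitions. */
typedef struct {
  int16_t re;
  int16_t im;
} cplx_i16;

typedef struct {
  uint8_t exp;                          /* shared exponent for the block */
  int16_t mantissa[2 * SAMPLES_PER_RB]; /* re/im interleaved, left unpacked */
} bfp_block;

/* Map v to a non-negative magnitude such that v fits in w signed bits
   exactly when magnitude(v) <= 2^(w-1) - 1. */
static int32_t magnitude(int32_t v) { return v >= 0 ? v : -v - 1; }

/* Floor division by 2^e, written so that it is well defined for negative v. */
static int32_t shift_right(int32_t v, unsigned e) {
  return v >= 0 ? v >> e : -((-v - 1) >> e) - 1;
}

/* Compress one resource block to `width`-bit mantissas (for example, 9). */
static void bfp_compress(const cplx_i16 *in, unsigned width, bfp_block *out) {
  assert(width >= 2 && width <= 16);
  int32_t max_mag = 0;
  for (int i = 0; i < SAMPLES_PER_RB; ++i) {
    int32_t mr = magnitude(in[i].re);
    int32_t mi = magnitude(in[i].im);
    if (mr > max_mag) max_mag = mr;
    if (mi > max_mag) max_mag = mi;
  }
  /* Smallest shift for which the largest value fits in `width` signed bits. */
  int32_t limit = (1 << (width - 1)) - 1;
  unsigned e = 0;
  while ((max_mag >> e) > limit) ++e;
  out->exp = (uint8_t)e;
  for (int i = 0; i < SAMPLES_PER_RB; ++i) {
    out->mantissa[2 * i] = (int16_t)shift_right(in[i].re, e);
    out->mantissa[2 * i + 1] = (int16_t)shift_right(in[i].im, e);
  }
}

/* Expand the block again; the low `exp` bits of each value are lost. */
static void bfp_decompress(const bfp_block *in, cplx_i16 *out) {
  for (int i = 0; i < SAMPLES_PER_RB; ++i) {
    out[i].re = (int16_t)(in->mantissa[2 * i] * (1 << in->exp));
    out[i].im = (int16_t)(in->mantissa[2 * i + 1] * (1 << in->exp));
  }
}
```

Calling bfp_compress(samples, 9, &block) followed by bfp_decompress(&block, restored) round-trips a block with a loss of at most the low block.exp bits per value. Multiplication is used instead of a left shift during expansion because left-shifting a negative value is undefined in C.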

    Resolved issues:

    There are no resolved issues to report in this release.

    Open issues:

    There are no open technical issues at the time of this release.

    • Release Note
    • EULA
    • Documentation
  • Arm RAN Acceleration Library: 21.04 April 13, 2021

    What's new in 21.04

    New features and enhancements:

    • The library is now tested with GCC 7.5.0, instead of with GCC 7.1.0.

    • Improved the performance of:

      • 9-bit block-float decompression.

      • All modulation kernels, including extra optimizations for QPSK and 256-QAM modulation.

    • Added Successive Cancellation List (SCL) decoding. The polar decoder (armral_polar_decoder) can now use SCL decoding in addition to the previously used Successive Cancellation (SC) method. A new parameter, 'l', has been added to polar decoding to specify the list size used in SCL. ArmRAL supports list sizes (l) of one, two, or four. If the list size is one, ArmRAL uses SC decoding. If the list size is greater than one, l candidate sequences are output, with the most likely candidate codeword first. Users must allocate a buffer of 'l*n' bits for the output codewords.

    • Added more benchmark cases for polar decoding.

    • Added support for three new CMake options:

      • '-DARMRAL_ARCH=<arch>' enables optimizations for AArch64-specific architecture features: Neon ('-DARMRAL_ARCH=NEON') or SVE2 ('-DARMRAL_ARCH=SVE2'). The default is '-DARMRAL_ARCH=NEON'.

      • '-DSTATIC_TESTING=On|Off', when 'On', forces the compiler to link the tests statically.

      • '-DARMRAL_TEST_RUNNER=<command>' specifies a command to use as a prefix before each test executable.

      For more information about all the new CMake options, see README.md.

    • Added SVE2 optimizations for the 'arm_cmplx_vecmul' and 'arm_cmplx_vecdot' functions.

    • Added Rader's algorithm for executing FFTs of prime length for which no hand-written kernel exists. FFTs of arbitrary length can now be computed, although FFTs that use Rader's algorithm are slower than FFTs that do not.

    • Added an 'examples' directory which contains simple programs that demonstrate how to use different functions in the library. To learn how to build and run the examples, see the README.md and example.md files.

    • Added two new 14-bit block-float compression and decompression functions: armral_block_float_compr_14bit and armral_block_float_decompr_14bit.

    • Documentation changes:

      • Documentation builds are now tested with Doxygen 1.8.13. The documentation configuration file, Doxyfile.in, has been updated to align it with the Doxygen 1.8.13 configuration file template.

      • References to 'Modules' have been updated to instead reference 'Functions'.

      • Brief descriptions have been added for each of the function groups and data structures that Arm RAN Acceleration Library supports.

      • Added a new tutorial, 'Link to Arm RAN Acceleration Library' (see example.md), that describes how to compile and link to Arm RAN Acceleration Library, and how to run the 'fft_cf32_example.c' example.
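
The SCL decoding entry in the list above requires callers to allocate 'l*n' bits for the output codewords. A minimal sketch of that sizing rule follows. The helper names are hypothetical and not part of the library, and the sketch assumes that the output bits are packed eight to a byte, which the text above does not state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; not part of the library. */

/* The release text lists one, two, and four as the supported list sizes. */
static int scl_list_size_supported(uint32_t l) {
  return l == 1 || l == 2 || l == 4;
}

/* Bytes needed to hold l candidate codewords of n bits each.
   Assumption: bits are packed eight to a byte, rounded up. */
static size_t scl_output_bytes(uint32_t n, uint32_t l) {
  assert(scl_list_size_supported(l));
  size_t bits = (size_t)l * n;
  return (bits + 7) / 8;
}

/* Allocate a zeroed output buffer, or return NULL for invalid arguments. */
static uint8_t *scl_alloc_output(uint32_t n, uint32_t l) {
  if (n == 0 || !scl_list_size_supported(l)) {
    return NULL;
  }
  return calloc(scl_output_bytes(n, l), 1);
}
```

For example, scl_alloc_output(1024, 4) reserves 512 bytes, enough for four candidate codewords of 1024 bits each. With a list size of one, the buffer holds a single codeword, which matches the SC decoding case.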

    Resolved issues:

    There are no resolved issues to report in this release.

    Open issues:

    There are no open technical issues at the time of this release.

    • Release Note
    • EULA
    • Documentation
  • Arm RAN Acceleration Library: 21.01 January 14, 2021

    What's new in 21.01

    New features and enhancements

    • The operation of the "type" parameter in the interface for armral_solve_* functions has changed. Instead of specifying the equalization type using a parameter, you must now specify the number of subcarriers per G matrix. To enable this, the "type" parameter in the interface for armral_solve_* functions has been replaced by "num_sc_per_g". You need to update your code so that you pass four or six to the armral_solve_* function, instead of passing one or two, respectively.

    • armral_solve_* functions now support equalization with a single subcarrier per G matrix. To enable this equalization, pass one as the value to the "num_sc_per_g" parameter in the armral_solve_* functions. Note that this type of equalization is not type-1 equalization; type-1 equalization solves four subcarriers per G matrix.

    • Benchmarking now prints all the performance data, in addition to the median value. Previously, only the median value was printed. You can use the additional data for performing further statistical analyses.

    • Benchmarking now prints the MD5 checksum of the binary being benchmarked. The checksum can be useful for deciding if code changes are responsible for performance differences, or if the performance differences are due to noise in the benchmark itself.

    • CMake now emits a warning if you attempt to configure using an old version of the GCC or Clang compilers. The warning is not an error: you can still build with unsupported compilers, but the library might not compile successfully.

    • Improved the CRC24 performance in both big-endian and little-endian modes.

    • The armral_cmplx_vecmul_i16_2 function now saturates intermediate values to operate the same as the other vector multiply functions.

    • Improved the performance of armral_cmplx_vecmul_i16_2.

    • Added the 'ARMRAL_ENABLE_COVERAGE' option to CMake. For more information about the 'ARMRAL_ENABLE_COVERAGE' option, see the README.

    • Added a new 'make uninstall' target. The new 'make uninstall' target simplifies the uninstall process; any empty directories that were previously created as part of uninstallation are now removed automatically.

    • Improved the performance of 9-bit and 12-bit block-float compression.

    • Improved the performance of 9-bit block-float decompression.

    • Improved the performance of the Pearson correlation coefficient calculation.

    • Improved the performance of equalization (armral_solve_*) functions for type-2 cases. In type-2 cases, the same equalization matrix G is used for six consecutive input vectors, that is, "num_sc_per_g=6". For more information, see the ArmRAL documentation.

    • Improved the performance of FFTs that take complex Q15 input and produce complex Q15 output.

    • Improved the performance of FFTs that take 32-bit complex float input and produce 32-bit complex float output.

    • The equalization routines (armral_solve_*) no longer accept numbers of samples that are not divisible by 12 because 12 is the size of one resource block.

    • Improved the performance of the 32-bit complex vector dot product functions: armral_cmplx_vecdot_i16_32bit and armral_cmplx_vecdot_i16_2_32bit.
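
The changes to the armral_solve_* interface described in the list above (the move from "type" to "num_sc_per_g", and the requirement that the number of samples is divisible by 12) can be sketched as a small migration aid. All function names here are hypothetical and not part of the library. The sketch assumes that only the values named in the text (one, four, and six) are valid for "num_sc_per_g", and that each G matrix covers "num_sc_per_g" consecutive samples.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { SC_PER_RESOURCE_BLOCK = 12 }; /* size of one resource block */

/* Hypothetical helper: translate the old "type" argument (1 or 2) into the
   replacement "num_sc_per_g" value (4 or 6). Returns 0 for other inputs. */
static uint32_t num_sc_per_g_from_legacy_type(uint32_t legacy_type) {
  switch (legacy_type) {
  case 1:
    return 4;
  case 2:
    return 6;
  default:
    return 0;
  }
}

/* Hypothetical pre-check mirroring the constraints in the release text.
   Assumption: 1, 4, and 6 are the only accepted num_sc_per_g values. */
static bool solve_args_look_valid(uint32_t num_samples, uint32_t num_sc_per_g) {
  bool g_ok = num_sc_per_g == 1 || num_sc_per_g == 4 || num_sc_per_g == 6;
  bool n_ok = num_samples != 0 && num_samples % SC_PER_RESOURCE_BLOCK == 0;
  return g_ok && n_ok;
}

/* Number of G matrices consumed, assuming each one covers num_sc_per_g
   consecutive samples. 12 is divisible by 1, 4, and 6, so this is exact. */
static uint32_t num_g_matrices(uint32_t num_samples, uint32_t num_sc_per_g) {
  assert(solve_args_look_valid(num_samples, num_sc_per_g));
  return num_samples / num_sc_per_g;
}
```

Code that previously passed a type of two would pass num_sc_per_g_from_legacy_type(2), which is six, and two resource blocks (24 samples) would then use four G matrices.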

    Resolved issues

    • 'CMAKE_C_FLAGS' and 'CMAKE_CXX_FLAGS' are no longer ignored when passed as environment variables to the initial CMake configure step.

    • Previously, CRC24 benchmarking crashed with an assertion failure when built with 'CMAKE_BUILD_TYPE=Debug' because the benchmarks attempted to pass an invalid length. These invalid cases have been removed.

    • Updated the Pearson correlation coefficient implementation to:

      • Use floating-point instead of fixed-point square root calculations. For large inputs, the fixed-point square root did not produce a correctly rounded result. The implementation now uses the floating-point square root and rounds the result to the nearest integer to match the equivalent fixed-point calculation.
      • Remove redundant bit-shifts which might cause inaccuracies for small coefficients.

    • Benchmarking incorrectly reported the solve_type*_2x2_* and solve_type*_1x4_* results: the solve_type*_1x4_* results were reported as the solve_type*_2x2_* results, and the solve_type*_2x2_* were reported as the solve_type*_1x4_* results. The reporting of the function results has been corrected.

    • Previously, polar decoding modified global state as part of the operation, which could lead to errors if multiple threads attempted decoding simultaneously. The polar decoding operation no longer modifies global state and is now thread-safe.

    • Previously, if the number of subcarriers was not a multiple of 24, type-1 equalization routines (armral_solve_*) read off the end of the input G arrays. The issue did not affect the correctness of the operation. However, the memory accesses responsible for the reads off the end of the arrays have been fixed.

    Open issues

    • There are no open technical issues at the time of this release.
    • Release Note
    • EULA
  • Arm RAN Acceleration Library: 20.10 October 02, 2020

    What's new in 20.10

    New features and enhancements

    • 20.10 is the first release of Arm RAN Acceleration Library.

    Resolved issues

    • There are no resolved issues to report in this release.

    Open issues

    • There are no open technical issues at the time of this release.
    • Release Note
    • EULA