What's new in 21.01
New features and enhancements
- The operation of the "type" parameter in the interface for armral_solve_* functions has changed. Instead of specifying the equalization type using a parameter, you must now specify the number of subcarriers per G matrix. To enable this, the "type" parameter in the interface for armral_solve_* functions has been replaced by "num_sc_per_g". You need to update your code so that you pass four or six to the armral_solve_* function, instead of passing one or two, respectively.
- armral_solve_* functions now support equalization with a single subcarrier per G matrix. To enable this equalization, pass one as the value to the "num_sc_per_g" parameter in the armral_solve_* functions. Note that this type of equalization is not type-1 equalization; type-1 equalization solves four subcarriers per G matrix.
- Benchmarking now prints all the performance data, in addition to the median value. Previously, only the median value was printed. You can use the additional data for performing further statistical analyses.
- Benchmarking now prints the MD5 checksum of the binary being benchmarked. The checksum can be useful for deciding if code changes are responsible for performance differences, or if the performance differences are due to noise in the benchmark itself.
- If you attempt to configure using an old version of the GCC or Clang compilers, CMake now emits a warning. The warning is not an error. You can build with unsupported compilers, but the warning indicates that the library might not compile successfully.
- Improved the CRC24 performance in both big-endian and little-endian modes.
- The armral_cmplx_vecmul_i16_2 function now saturates intermediate values to operate the same as the other vector multiply functions.
- Improved the performance of armral_cmplx_vecmul_i16_2.
- Added the 'ARMRAL_ENABLE_COVERAGE' option to CMake. For more information about the 'ARMRAL_ENABLE_COVERAGE' option, see the README.
- Added a new 'make uninstall' target. The new 'make uninstall' target simplifies the uninstall process; any empty directories that were previously created as part of uninstallation are now removed automatically.
- Improved the performance of 9-bit and 12-bit block-float compression.
- Improved the performance of 9-bit block-float decompression.
- Improved the performance of the Pearson correlation coefficient calculation.
- Improved the performance of equalization (armral_solve_*) functions for type-2 cases. In type-2 cases, the same equalization matrix G is used for six consecutive input vectors, in other words "num_sc_per_g=6". For more information, see the ArmRAL documentation.
- Improved the performance of FFTs that take complex Q15 input and produce complex Q15 output.
- Improved the performance of FFTs that take 32-bit complex float input and produce 32-bit complex float output.
- The equalization routines (armral_solve_*) no longer accept numbers of samples that are not divisible by 12 because 12 is the size of one resource block.
- Improved the performance of the 32-bit complex vector dot product functions: armral_cmplx_vecdot_i16_32bit and armral_cmplx_vecdot_i16_2_32bit.
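The change from "type" to "num_sc_per_g" described in the first two items can be sketched as a small migration helper. This is only an illustration: the helper names below are hypothetical and are not part of ArmRAL, and the check assumes that the only accepted "num_sc_per_g" values are the ones named in this release note (one, four, and six). See the ArmRAL documentation for the real armral_solve_* signatures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of ArmRAL: converts the value that was
 * passed as the old "type" parameter into the value expected by the new
 * "num_sc_per_g" parameter. Returns 0 for an unknown legacy type. */
static uint32_t legacy_type_to_num_sc_per_g(uint32_t legacy_type) {
  switch (legacy_type) {
  case 1:
    return 4; /* type-1: one G matrix per four subcarriers */
  case 2:
    return 6; /* type-2: one G matrix per six subcarriers */
  default:
    return 0;
  }
}

/* Hypothetical pre-check based on the restrictions in this release note:
 * num_sc_per_g is 1, 4, or 6 (assumed to be the full set), and the number
 * of samples is a multiple of 12, the size of one resource block. */
static bool solve_args_are_valid(uint32_t num_samples, uint32_t num_sc_per_g) {
  bool g_ok = num_sc_per_g == 1 || num_sc_per_g == 4 || num_sc_per_g == 6;
  return g_ok && num_samples % 12 == 0;
}
```

Code that previously passed a "type" of one or two could call legacy_type_to_num_sc_per_g once and pass the result on. New code can pass one, four, or six directly.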
Resolved issues
- 'CMAKE_C_FLAGS' and 'CMAKE_CXX_FLAGS' are no longer ignored when passed as environment variables to the initial CMake configure step.
- Previously, CRC24 benchmarking crashed with an assertion failure when built with 'CMAKE_BUILD_TYPE=Debug' because the benchmarks attempted to pass an invalid length. These invalid cases have been removed.
- Updated the Pearson correlation coefficient implementation to:
  - Use floating-point instead of fixed-point square root calculations. For large inputs, the fixed-point square root did not produce a correctly rounded result. Now, the implementation uses the floating-point square root. To convert the result to the equivalent of a fixed-point calculation, the implementation rounds the result to the nearest integer.
  - Remove redundant bit-shifts which might cause inaccuracies for small coefficients.
- Benchmarking incorrectly reported the solve_type*_2x2_* and solve_type*_1x4_* results: the solve_type*_1x4_* results were reported as the solve_type*_2x2_* results, and the solve_type*_2x2_* were reported as the solve_type*_1x4_* results. The reporting of the function results has been corrected.
- Previously, Polar decoding modified global state as part of the operation, which could lead to errors if multiple threads attempted decoding simultaneously. The polar decoding operation no longer modifies global state. The function is now thread-safe.
- Previously, if the number of subcarriers was not a multiple of 24, type-1 equalization routines (armral_solve_*) read off the end of the input G arrays. The issue does not affect the correctness of the operation. However, the memory accesses responsible for the reads off the end of the arrays have been fixed.
Open issues
- There are no open technical issues at the time of this release.
Arm RAN Acceleration Library 21.01 Release Note
===============================================
Non-Confidential
Copyright © 2020-2021 Arm Limited (or its affiliates). All rights reserved.
Non-Confidential Proprietary Notice
===================================
This document is protected by copyright and other related rights and the
practice or implementation of the information contained in this document may be
protected by one or more patents or pending patent applications. No part of this
document may be reproduced in any form by any means without the express prior
written permission of Arm. No license, express or implied, by estoppel or
otherwise to any intellectual property rights is granted by this document
unless specifically stated.
Your access to the information in this document is conditional upon your
acceptance that you will not use or permit others to use the information for
the purposes of determining whether implementations infringe any third party
patents.
THIS DOCUMENT IS PROVIDED “AS IS”. ARM PROVIDES NO REPRESENTATIONS AND NO
WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, INCLUDING, WITHOUT LIMITATION, THE
IMPLIED WARRANTIES OF MERCHANTABILITY, SATISFACTORY QUALITY, NON-INFRINGEMENT
OR FITNESS FOR A PARTICULAR PURPOSE WITH RESPECT TO THE DOCUMENT. For the
avoidance of doubt, Arm makes no representation with respect to, and has
undertaken no analysis to identify or understand the scope and content of,
patents, copyrights, trade secrets, or other rights.
This document may include technical inaccuracies or typographical errors.
TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL ARM BE LIABLE FOR ANY
DAMAGES, INCLUDING WITHOUT LIMITATION ANY DIRECT, INDIRECT, SPECIAL,
INCIDENTAL, PUNITIVE, OR CONSEQUENTIAL DAMAGES, HOWEVER CAUSED AND REGARDLESS
OF THE THEORY OF LIABILITY, ARISING OUT OF ANY USE OF THIS DOCUMENT, EVEN IF
ARM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
This document consists solely of commercial items. You shall be responsible for
ensuring that any use, duplication or disclosure of this document complies fully
with any relevant export laws and regulations to assure that this document or
any portion thereof is not exported, directly or indirectly, in violation of
such export laws. Use of the word “partner” in reference to Arm’s customers is
not intended to create or refer to any partnership relationship with any other
company. Arm may make changes to this document at any time and without notice.
If any of the provisions contained in these terms conflict with any of the
provisions of any click through or signed written agreement covering this
document with Arm, then the click through or signed written agreement prevails
over and supersedes the conflicting provisions of these terms. This document may
be translated into other languages for convenience, and you agree that if there
is any conflict between the English version of this document and any
translation, the terms of the English version of the Agreement shall prevail.
The Arm corporate logo and words marked with ® or ™ are registered trademarks or
trademarks of Arm Limited (or its affiliates) in the US and/or elsewhere. All
rights reserved. Other brands and names mentioned in this document may be the
trademarks of their respective owners. Please follow Arm’s trademark usage
guidelines at http://www.arm.com/company/policies/trademarks.
Copyright © 2020-2021 Arm Limited (or its affiliates). All rights reserved.
Arm Limited. Company 02557590 registered in England.
110 Fulbourn Road, Cambridge, England CB1 9NJ.
(LES-PRE-20349)
Confidentiality Status
----------------------
This document is Non-Confidential. The right to use, copy and disclose this
document may be subject to license restrictions in accordance with the terms
of the agreement entered into by Arm and the party that Arm delivered this
document to.
Unrestricted Access is an Arm internal classification.
Product Status
--------------
The information in this document is Final, that is for a developed product.
Web Address
-----------
https://developer.arm.com
Progressive terminology commitment
----------------------------------
Arm values inclusive communities. Arm recognizes that we and our industry have
used terms that can be offensive. Arm strives to lead the industry and create
change.
We believe that this document contains no offensive terms.
If you find offensive terms in this document, please contact terms@arm.com.
Contents
========
- Conventions
- Release overview
- Release contents
- Get started
- Support
- Release History
Conventions
===========
The following subsections describe conventions used in Arm documents.
Glossary
--------
The Arm Glossary is a list of terms that are used in Arm documentation, together
with definitions for those terms. The Arm Glossary does not contain terms that
are industry standard unless the Arm meaning differs from the generally accepted
meaning.
See the Arm Glossary for more information: https://developer.arm.com/glossary.
Release overview
================
Use of Arm RAN Acceleration Library is subject to the terms and conditions of
the applicable End User License Agreement (“EULA”). A copy of the EULA can be
found in the 'license_terms' folder of your product installation.
Product description
-------------------
The Arm RAN Acceleration Library (ArmRAL) contains a set of functions for
accelerating telecommunications applications such as, but not limited to, 5G
Radio Access Networks (RANs).
The Arm RAN Acceleration Library 21.01 package provides a
library that is optimized for Armv8-A AArch64-based processors.
The library provides:
- Vector functions
- Matrix functions
- Lower PHY support functions
- Upper PHY support functions
- DU-RU Interface support functions
The library includes functions that operate on 16-bit signed integers and 32-bit
floating-point values.
Release Status
--------------
This is the 21.01 release of Arm RAN Acceleration Library.
These deliverables are being released under the terms of the agreement between
Arm and each licensee (the "Agreement"). All planned verification and
validation is complete.
The release is suitable for volume production under the terms of the Agreement.
Release contents
================
The following subsections describe:
- The product parts that are delivered as part of this release.
- Any changes since the previous release.
- Any known issues and limitations that exist at the time of this release.
Deliverables
------------
- Arm RAN Acceleration Library 21.01
- Release Notes (this document)
- Documentation (product documentation is available on the Arm developer
website at:
https://developer.arm.com/solutions/infrastructure/developer-resources/5g/ran)
Documentation may change between product releases. For the latest
documentation bundle, check the delivery platform.
Arm tests its PDFs only in Adobe Acrobat and Acrobat Reader. Arm cannot
guarantee the quality of this document when used with any other PDF reader.
A suitable file reader can be downloaded from Adobe at http://www.adobe.com
Differences from previous release
---------------------------------
The following subsections describe differences from the previous release of
Arm RAN Acceleration Library.
Additions and changes:
~~~~~~~~~~~~~~~~~~~~~~
Describes new features or components added, or any technical changes to
features or components, in this release.
- The operation of the "type" parameter in the interface for armral_solve_*
functions has changed. Instead of specifying the equalization type using a
parameter, you must now specify the number of subcarriers per G matrix. To
enable this, the "type" parameter in the interface for armral_solve_*
functions has been replaced by "num_sc_per_g". You need to update your code
so that you pass four or six to the armral_solve_* function, instead of
passing one or two, respectively.
- armral_solve_* functions now support equalization with a single subcarrier
per G matrix. To enable this equalization, pass one as the value to the
"num_sc_per_g" parameter in the armral_solve_* functions. Note that this
type of equalization is not type-1 equalization; type-1 equalization solves
four subcarriers per G matrix.
- Benchmarking now prints all the performance data, in addition to the median
value. Previously, only the median value was printed. You can use the
additional data for performing further statistical analyses.
- Benchmarking now prints the MD5 checksum of the binary being benchmarked. The
checksum can be useful for deciding if code changes are responsible for
performance differences, or if the performance differences are due to noise in
the benchmark itself.
- If you attempt to configure using an old version of the GCC or Clang
compilers, CMake now emits a warning. The warning is not an error. You can
build with unsupported compilers, but the warning indicates that the library
might not compile successfully.
- Improved the CRC24 performance in both big-endian and little-endian modes.
- The armral_cmplx_vecmul_i16_2 function now saturates intermediate values to
operate the same as the other vector multiply functions.
- Improved the performance of armral_cmplx_vecmul_i16_2.
- Added the 'ARMRAL_ENABLE_COVERAGE' option to CMake. For more information about
the 'ARMRAL_ENABLE_COVERAGE' option, see the README.
- Added a new 'make uninstall' target. The new 'make uninstall' target
simplifies the uninstall process; any empty directories that were previously
created as part of uninstallation are now removed automatically.
- Improved the performance of 9-bit and 12-bit block-float compression.
- Improved the performance of 9-bit block-float decompression.
- Improved the performance of the Pearson correlation coefficient calculation.
- Improved the performance of equalization (armral_solve_*) functions for type-2
cases. In type-2 cases, the same equalization matrix G is used for six
consecutive input vectors, in other words "num_sc_per_g=6". For more
information, see the ArmRAL documentation.
- Improved the performance of FFTs that take complex Q15 input and produce
complex Q15 output.
- Improved the performance of FFTs that take 32-bit complex float input and
produce 32-bit complex float output.
- The equalization routines (armral_solve_*) no longer accept numbers of
samples that are not divisible by 12 because 12 is the size of one
resource block.
- Improved the performance of the 32-bit complex vector dot product functions:
armral_cmplx_vecdot_i16_32bit and armral_cmplx_vecdot_i16_2_32bit.
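To illustrate what saturating an intermediate value means for the armral_cmplx_vecmul_i16_2 item in the list above, here is a minimal scalar sketch. It is not the library's implementation. The function names are hypothetical, and the Q15 scaling of the inputs is an assumption made for the example.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate to the int16_t range instead of letting it
 * wrap around when it is narrowed to 16 bits. */
static int16_t saturate_i16(int32_t x) {
  if (x > INT16_MAX) {
    return INT16_MAX;
  }
  if (x < INT16_MIN) {
    return INT16_MIN;
  }
  return (int16_t)x;
}

/* Real part of (a_re + j*a_im) * (b_re + j*b_im) for inputs in an assumed
 * Q15 format: accumulate in 32 bits, scale back by 15 bits, then saturate.
 * The shift assumes an arithmetic right shift for negative values, which
 * mainstream compilers provide. */
static int16_t cmplx_mul_re_q15_sat(int16_t a_re, int16_t a_im,
                                    int16_t b_re, int16_t b_im) {
  int32_t acc = (int32_t)a_re * b_re - (int32_t)a_im * b_im;
  return saturate_i16(acc >> 15);
}
```

Without the clamp, multiplying the most negative Q15 value by itself would give 32768 after scaling, which does not fit in 16 bits and would wrap to a negative number. With saturation, the result is the largest representable value instead.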
Resolved issues:
~~~~~~~~~~~~~~~~
- 'CMAKE_C_FLAGS' and 'CMAKE_CXX_FLAGS' are no longer ignored when passed as
environment variables to the initial CMake configure step.
- Previously, CRC24 benchmarking crashed with an assertion failure when built
with 'CMAKE_BUILD_TYPE=Debug' because the benchmarks attempted to pass an
invalid length. These invalid cases have been removed.
- Updated the Pearson correlation coefficient implementation to:
  * Use floating-point instead of fixed-point square root calculations. For
    large inputs, the fixed-point square root did not produce a
    correctly rounded result. Now, the implementation uses the floating-point
    square root. To convert the result to the equivalent of a fixed-point
    calculation, the implementation rounds the result to the nearest integer.
  * Remove redundant bit-shifts which might cause inaccuracies for small
    coefficients.
- Benchmarking incorrectly reported the solve_type*_2x2_* and solve_type*_1x4_*
results: the solve_type*_1x4_* results were reported as the solve_type*_2x2_*
results, and the solve_type*_2x2_* were reported as the solve_type*_1x4_*
results. The reporting of the function results has been corrected.
- Previously, Polar decoding modified global state as part of the operation,
which could lead to errors if multiple threads attempted decoding
simultaneously. The polar decoding operation no longer modifies global state.
The function is now thread-safe.
- Previously, if the number of subcarriers was not a multiple of 24, type-1
equalization routines (armral_solve_*) read off the end of the input G arrays.
The issue does not affect the correctness of the operation. However, the
memory accesses responsible for the reads off the end of the arrays have been
fixed.
Known limitations
-----------------
There are no open technical issues at the time of this release.
Get started
===========
This section describes information to help you get started with accessing,
setting up, and using Arm RAN Acceleration Library.
Licensing information
---------------------
Use of Arm RAN Acceleration Library is subject to the terms and conditions of
the applicable End User License Agreement (“EULA”). A copy of the EULA can be
found in the 'license_terms' folder of your product installation.
You do not require a license to use this Arm RAN Acceleration Library package.
Prerequisites
-------------
The library runs on AArch64 cores. However, to use the CRC functions, you
must run on a core that supports the AArch64 PMULL extension. If your machine
supports the PMULL extension, 'pmull' appears in the "Features" list in
the /proc/cpuinfo file.
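The /proc/cpuinfo check described above can also be done programmatically. The following is a minimal sketch, not part of ArmRAL: the function name cpuinfo_has_feature is hypothetical, and it assumes the usual Linux layout in which the flags follow a colon on a line beginning with "Features".

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Returns true if `feature` appears as a whole word on a "Features" line
 * of the cpuinfo-style file at `path` (normally "/proc/cpuinfo").
 * Returns false if the file cannot be opened. */
static bool cpuinfo_has_feature(const char *path, const char *feature) {
  FILE *f = fopen(path, "r");
  if (f == NULL) {
    return false;
  }
  char line[4096];
  bool found = false;
  while (!found && fgets(line, sizeof line, f) != NULL) {
    if (strncmp(line, "Features", 8) != 0) {
      continue;
    }
    char *colon = strchr(line, ':');
    if (colon == NULL) {
      continue;
    }
    /* Split the flag list on whitespace and compare each token, so that
     * a partial match such as "pmul" is not accepted. */
    for (char *tok = strtok(colon + 1, " \t\n"); tok != NULL;
         tok = strtok(NULL, " \t\n")) {
      if (strcmp(tok, feature) == 0) {
        found = true;
        break;
      }
    }
  }
  fclose(f);
  return found;
}
```

A program could call cpuinfo_has_feature("/proc/cpuinfo", "pmull") at startup and avoid the CRC functions when the result is false.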
If any of the following tools are not already installed on your system, you
must install them before building or running Arm RAN Acceleration Library:
* A recent version of a C/C++ compiler, such as GCC. The library has been tested
with GCC 7.1.0, 8.2.0, 9.3.0, and 10.2.0.
* A recent version of CMake (version 3.0.0, or higher).
In addition to the preceding requirements:
* To run the benchmarks, you must have the Linux utility tool 'perf' installed.
* To build a local version of the documentation, you must have Doxygen
installed.
Download Arm RAN Acceleration Library
-------------------------------------
Arm delivers the Arm RAN Acceleration Library files through the Arm developer
website:
https://developer.arm.com/solutions/infrastructure/developer-resources/5g/ran
Unpack the product
------------------
The following steps describe how to unpack each constituent part delivered in
this bundle:
1. Relocate the bundle file. Move the .tar.gz file to the directory you want to
build the product in.
2. Extract the .tar.gz file contents using a tar utility:
    tar zxvf arm-ran-acceleration-library-21.01-aarch64.tar.gz
Compile the product
-------------------
To build the library, navigate to the unpacked product directory and use the
following commands:
    mkdir <build>
    cd <build>
    cmake [options] <path>
    make
Substitute:
* '<build>' with a directory name to build the library in.
* '[options]' with the CMake options to use to build the library.
* '<path>' with the path to the root directory of the library source.
For a list of the common CMake options that are supported, see the instructions
in the README.md file. You can find README.md in the unpacked package directory.
Directory structure:
--------------------
Shows the principal directory structure of this release created after unpacking
the bundle:
    license_terms/
    docs/
    src/
    include/
    test/
    Doxyfile.in
    README.md
    RELEASE_NOTES.txt
Install the product
-------------------
1. If you have not already built the library, or if you want to rebuild the
library to specify a custom install location, navigate to the unpacked
product directory and run CMake:
    mkdir <build>
    cd <build>
    cmake [options] -DCMAKE_INSTALL_PREFIX=<install-dir> <path>
    make
Substitute:
* '<build>' with a build directory name. The library is built in the
specified directory.
* '[options]' with the CMake options to use to build the library.
* (Optional) '<install-dir>' with an installation directory name. The library
installs to the specified directory.
* '<path>' with the path to the root directory of the library source.
Note: You must have write access for the installation directories:
* For a default installation, you must have write access for
'/usr/local/lib/', for the library, and '/usr/local/include/', for the
header files.
* For a custom installation, you must have write access for
'<install-dir>/lib/', for the library, and '<install-dir>/include/', for the
header files.
2. To install the library, run:
    make install
An install creates an 'install_manifest.txt' file in the library build
directory. 'install_manifest.txt' lists the installation locations for the
library and the header files.
Use Arm RAN Acceleration Library
--------------------------------
To use the Arm RAN Acceleration Library functions, include the armral.h header
file in your C or C++ source code.
For more information, see the README.md file or the documentation on the
Arm developer website:
https://developer.arm.com/documentation/102249/2101
Tests and benchmarks
--------------------
Tests
~~~~~
Note: To run the included library tests, you must have built the library with
the '-DBUILD_TESTING=On' CMake option.
To run the included tests, use:
    make check
The tests check that the optimized library implementation of a function matches
the results of a reference implementation. Test times vary from system to
system, but typically only take a few seconds.
Benchmarks
~~~~~~~~~~
Note: To run the included benchmark tests, you must have built the library with
the '-DBUILD_TESTING=On' CMake option.
To run the benchmarks, use:
    make bench
Benchmark results are printed as JSON objects. To further process these objects,
you can collect or pipe them into other scripts.
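As a sketch of the kind of further processing that the per-run data allows, the following functions compute the median and mean of a set of measurements. They assume that the values have already been extracted from the JSON output into an array of doubles. The JSON field names are not shown in this document, so the extraction step is left out, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: orders doubles in ascending order. */
static int compare_doubles(const void *a, const void *b) {
  double x = *(const double *)a;
  double y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Sorts `samples` in place and returns the median. For an even count, the
 * median is the average of the two middle values. Returns 0.0 when n is 0. */
static double median_of_samples(double *samples, size_t n) {
  if (n == 0) {
    return 0.0;
  }
  qsort(samples, n, sizeof samples[0], compare_doubles);
  if (n % 2 == 1) {
    return samples[n / 2];
  }
  return (samples[n / 2 - 1] + samples[n / 2]) / 2.0;
}

/* Returns the arithmetic mean of `samples`, or 0.0 when n is 0. */
static double mean_of_samples(const double *samples, size_t n) {
  if (n == 0) {
    return 0.0;
  }
  double sum = 0.0;
  for (size_t i = 0; i < n; ++i) {
    sum += samples[i];
  }
  return sum / (double)n;
}
```

Comparing the mean against the median is a quick way to spot runs that are skewed by a few outliers.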
Uninstall
---------
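To uninstall the library, navigate to the library build directory that you
previously ran 'make install' in, and run:
    cat install_manifest.txt | xargs rm
Alternatively, use the 'make uninstall' target that is new in this release.
The target also removes any directories that are left empty:
    make uninstall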
Support
=======
Documentation for using this Arm RAN Acceleration Library package is available
on the Arm developer website at:
https://developer.arm.com/solutions/infrastructure/developer-resources/5g/ran
Reference documentation for the supported routines in Arm RAN Acceleration
Library is available at:
https://developer.arm.com/documentation/102249/2101
If you have Doxygen installed on your system, you can build your own HTML
version of the documentation using CMake. To build the HTML documentation, use:
    make docs
If you have any issues with the installation, content, or use of this release,
raise a question on the Developer Community Forum at:
https://community.arm.com/developer/f/infrastructure-solutions
These deliverables are being released under the terms of the agreement between
Arm and each licensee (the “Agreement”). All planned verification and
validation is complete. The release is suitable for volume production under
the terms of the Agreement.
Release history
===============
A full release history (with release notes) for Arm RAN Acceleration Library
is available on the Arm developer website:
https://developer.arm.com/solutions/infrastructure/developer-resources/5g/ran/release-history