Building NumPy with Arm Compiler


How to build NumPy with Arm Compiler.

NumPy is a package for scientific computing with Python. It provides a powerful N-dimensional array object, sophisticated (broadcasting) functions, tools for integrating C/C++ and Fortran code, and useful linear algebra, Fourier transform, and random number capabilities.

For more information, see the NumPy website.

The following components are used for this build:

Component                    Version
NumPy                        1.15
Python                       3.6
Arm Compiler                 19.0
Arm Performance Libraries
Operating system             RHEL 7.5
Hardware                     Cavium ThunderX2

Before you begin


  1. Check out the latest NumPy sources. Run:

    git clone numpy

    and change into the numpy directory:

    cd numpy
  2. Add 'arm' to the list of known C compilers.

    To allow the --compiler=arm flag to be used when building NumPy, edit numpy/distutils/ and add an entry for the Arm Compiler class in the function definition CCompiler_cxx_compiler:

    compiler_class['pathcc'] = ('pathccompiler', 'PathScaleCCompiler',
                                "PathScale Compiler for SiCortex-based applications")
    compiler_class['arm'] = ('armccompiler', 'ArmCCompiler',
                               "Arm C Compiler")
    ccompiler._default_compilers += (('linux.*', 'intel'),
                                     ('linux.*', 'intele'),
                                     ('linux.*', 'intelem'),
                                     ('linux.*', 'pathcc'),
                                     ('nt', 'intelw'),
                                     ('nt', 'intelemw'))
  3. To configure for the Arm C/C++ Compiler, add a new file, containing:

    from __future__ import division, absolute_import, print_function
    from distutils.unixccompiler import UnixCCompiler
    class ArmCCompiler(UnixCCompiler):
        """Arm compiler."""
        compiler_type = 'arm'
        cc_exe = 'armclang'
        cxx_exe = 'armclang++'
        def __init__ (self, verbose=0, dry_run=0, force=0):
            UnixCCompiler.__init__ (self, verbose, dry_run, force)
            cc_compiler = self.cc_exe
            cxx_compiler = self.cxx_exe
            self.set_executables(compiler=cc_compiler + ' -mcpu=native -O3',
                                 compiler_so=cc_compiler + ' -mcpu=native -O3',
                                 compiler_cxx=cxx_compiler + ' -mcpu=native -O3',
                                 linker_exe=cc_compiler + ' -lamath',
                                 linker_so=cc_compiler + ' -lamath -shared')
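
    The ArmCCompiler class above assumes that armclang and armclang++ can be found on PATH. As a minimal sketch (the helper name find_missing_executables is hypothetical and is not part of NumPy or distutils), the following check reports any executable that cannot be located before you start the build:

```python
import shutil

# Executable names taken from cc_exe and cxx_exe in the ArmCCompiler class.
ARM_EXECUTABLES = ("armclang", "armclang++")


def find_missing_executables(names=ARM_EXECUTABLES, which=shutil.which):
    """Return the executable names that cannot be located on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    missing = find_missing_executables()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All Arm compiler executables were found.")
```

    If an executable is reported as missing, loading the Arm Compiler module file before building is the likely fix.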
  4. Provide system_info classes for Arm Performance Libraries, covering BLAS, LAPACK, and FFTW3. Edit the file numpy/distutils/ as follows:

    1. Add entries to the get_info function definition.

      Note: By default, atlas is the first entry.

        cl = {'armpl': armpl_info,
              'blas_armpl': blas_armpl_info,
              'lapack_armpl': lapack_armpl_info,
              'fftw3_armpl': fftw3_armpl_info,
              'atlas': atlas_info,  # use lapack_opt or blas_opt instead
              # ... remaining existing entries are unchanged ...
    2. Add a class definition for fftw3_armpl_info:

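      class fftw3_armpl_info(fftw_info):
          section = 'fftw3'
          dir_env_var = 'ARMPL_DIR'
          notfounderror = FFTWNotFoundError
          # Assumption: 'libs' follows _lib_armpl in armpl_info, and
          # 'includes' names the standard FFTW3 header.
          ver_info = [{'name': 'fftw3',
                       'libs': ['armpl_lp64_mp'],
                       'includes': ['fftw3.h'],
                       'macros': [('SCIPY_FFTW3_H', None)]}]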

      where ARMPL_DIR is an environment variable that gives the location of Arm Performance Libraries. If you use environment modules, the Arm Performance Libraries module file sets this variable. For example, for a ThunderX2 platform:

      module load ThunderX2CN99/RHEL/7/arm-hpc-compiler-19.0/armpl/19.0

      or for a generic Armv8-A target:

      module load Generic-AArch64/RHEL/7/arm-hpc-compiler-19.0/armpl/19.0
    3. Make the optimized LAPACK and BLAS lookups use ArmPL. Before the call to get_info('lapack_mkl'), add a clause that detects Arm Performance Libraries and, if they are found, sets lapack_opt_info from lapack_armpl_info. Add the equivalent clause so that blas_opt_info is set from blas_armpl_info:

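      class lapack_opt_info(system_info):
          notfounderror = LapackNotFoundError
          def calc_info(self):
              lapack_armpl_info = get_info('lapack_armpl')
              if lapack_armpl_info:
                  # Arm Performance Libraries found: use them and stop here
                  self.set_info(**lapack_armpl_info)
                  return
              # ... existing code, starting with get_info('lapack_mkl') ...

      class blas_opt_info(system_info):
          notfounderror = BlasNotFoundError
          def calc_info(self):
              blas_armpl_info = get_info('blas_armpl')
              if blas_armpl_info:
                  # Arm Performance Libraries found: use them and stop here
                  self.set_info(**blas_armpl_info)
                  return
              # ... existing code ...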
    4. Add the armpl_info class after the mkl_info class:

      class armpl_info(system_info):
          section = 'armpl'
          dir_env_var = 'ARMPL_DIR'
          _lib_armpl = ['armpl_lp64_mp']
          def calc_info(self):
              lib_dirs = self.get_lib_dirs()
              incl_dirs = self.get_include_dirs()
              armpl_libs = self.get_libs('armpl_libs', self._lib_armpl)
              info = self.check_libs2(lib_dirs, armpl_libs)
              if info is None:
                  # Library not found: leave the info empty
                  return
              dict_append(info,
                          define_macros=[('SCIPY_MKL_H', None),
                                         ('HAVE_CBLAS', None)],
                          include_dirs=incl_dirs)
              self.set_info(**info)

      class lapack_armpl_info(armpl_info):
          pass

      class blas_armpl_info(armpl_info):
          pass
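
    The system_info classes in this step locate Arm Performance Libraries through the ARMPL_DIR environment variable. The following minimal sketch checks that the variable is set and that the armpl_lp64_mp library can be found before you build. It assumes that the library is installed in a lib subdirectory of ARMPL_DIR. The helper names armpl_library_candidates and check_armpl are hypothetical and are not part of NumPy.

```python
import os


def armpl_library_candidates(armpl_dir, lib_name="armpl_lp64_mp"):
    """Return candidate paths for the shared and static library files.

    Assumption: the libraries are installed in <ARMPL_DIR>/lib.
    """
    lib_dir = os.path.join(armpl_dir, "lib")
    return [os.path.join(lib_dir, "lib" + lib_name + ext) for ext in (".so", ".a")]


def check_armpl(environ=os.environ):
    """Return (ok, message) describing whether ARMPL_DIR looks usable."""
    armpl_dir = environ.get("ARMPL_DIR")
    if not armpl_dir:
        return False, "ARMPL_DIR is not set; load the ArmPL module file first."
    found = [p for p in armpl_library_candidates(armpl_dir) if os.path.isfile(p)]
    if not found:
        return False, "No armpl_lp64_mp library under " + os.path.join(armpl_dir, "lib")
    return True, "Found " + found[0]


if __name__ == "__main__":
    ok, message = check_armpl()
    print(message)
```

    Running the script after the module load command should report the path of the library that was found.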
  5. Configure NumPy for the Arm Fortran Compiler:

    1. Add 'arm' to the list of known Fortran compilers. In numpy/distutils/fcompiler/__, add an 'arm' entry to the platform mappings in the function definition of wrap_unlinkable_objects:

      _default_compilers = (
          # sys.platform mappings
          ('linux.*', ('arm', 'gnu95', 'intel', 'lahey', 'pg', 'absoft', 'nag', 'vast', 'compaq',
                       'intele', 'intelem', 'gnu', 'g95', 'pathf95', 'nagfor')),
    2. Add a configuration file for the Arm Fortran Compiler. In numpy/distutils/fcompiler/<file>, create a new file with the following contents:

      from __future__ import division, absolute_import, print_function
      import sys
      from numpy.distutils.fcompiler import FCompiler, dummy_fortran_file
      from sys import platform
      from os.path import join, dirname, normpath
      compilers = ['ArmFlangCompiler']
      import functools
      class ArmFlangCompiler(FCompiler):
          compiler_type = 'arm'
          description = 'Arm Compiler'
          version_pattern = r'\s*Arm.*version (?P<version>[\d.-]+).*'

          ar_exe = 'lib.exe'
          possible_executables = ['armflang']

          executables = {
              'version_cmd': ["", "--version"],
              'compiler_f77': ["armflang"],
              'compiler_fix': ["armflang", "-ffixed-form"],
              'compiler_f90': ["armflang"],
              'linker_so': ["armflang", "-fPIC", "-shared"],
              'archiver': ["ar", "-cr"],
              'ranlib': None
          }

          pic_flags = ["-fPIC", "-DPIC"]
          c_compiler = 'arm'
          module_dir_switch = '-module '  # Don't remove ending space!

          def get_libraries(self):
              opt = FCompiler.get_libraries(self)
              opt.extend(['flang', 'flangrti', 'ompstub'])
              return opt
          def get_library_dirs(self):
              """List of compiler library directories."""
              opt = FCompiler.get_library_dirs(self)
              flang_dir = dirname(self.executables['compiler_f77'][0])
              opt.append(normpath(join(flang_dir, '..', 'lib')))
              return opt
          def get_flags(self):
              return []
          def get_flags_free(self):
              return []
          def get_flags_debug(self):
              return ['-g']
          def get_flags_opt(self):
              return ['-O3']
          def get_flags_arch(self):
              return []
          def runtime_library_dir_option(self, dir):
              raise NotImplementedError
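      if __name__ == '__main__':
          from distutils import log
          log.set_verbosity(2)
          from numpy.distutils import customized_fcompiler
          # Print the detected Arm Fortran compiler version as a quick check
          print(customized_fcompiler(compiler='arm').get_version())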
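
    The version_pattern attribute in the class above is a regular expression with a named group, version, that is matched against the output of the version command. The following minimal sketch shows what the pattern extracts. The sample string is a hypothetical first line of armflang --version output, not captured from a real installation, and extract_version is a hypothetical helper.

```python
import re

# Pattern copied from ArmFlangCompiler.version_pattern.
VERSION_PATTERN = r'\s*Arm.*version (?P<version>[\d.-]+).*'


def extract_version(version_output):
    """Return the text of the 'version' group, or None if there is no match."""
    match = re.match(VERSION_PATTERN, version_output)
    return match.group('version') if match else None


# Hypothetical version banner, used for illustration only.
sample = "Arm C/C++/Fortran Compiler version 19.0 (build number 69)"
print(extract_version(sample))  # prints 19.0
```

    Output that does not start with "Arm", for example the banner of a different Fortran compiler, does not match the pattern, and the helper returns None.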

  6. Build NumPy:

    Note: Arm suggests building in a virtual environment first, before doing a global installation.

    1. (Optional) Create a virtual environment. If virtualenv is not installed, install it using pip3 install virtualenv. Then initialize the virtual environment:

      python3 -m venv ~/venv_numpy_armpl
      source ~/venv_numpy_armpl/bin/activate
      pip3 install Cython
    2. Build and install NumPy:

      CC=armclang python3 config --compiler=arm build_clib --compiler=arm build_ext
      CC=armclang python3 config --compiler=arm build_clib --compiler=arm build_ext --compiler=arm install

      You can confirm that NumPy has been built with Arm Performance Libraries using numpy.show_config().

    3. Test the installation. Run the NumPy test suite (this may require installing pytest using pip3 install pytest):

      python3 -c 'import numpy; numpy.test()'

      Note: Quad precision is not supported, which results in a test failure.
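
    To confirm that the installed NumPy uses Arm Performance Libraries, you can capture the text printed by numpy.show_config() and look for the armpl library name. The following is a minimal sketch. The helper names mentions_armpl and captured_numpy_config are hypothetical, and the check assumes that the library name (armpl_lp64_mp) appears in the printed configuration.

```python
import contextlib
import io


def mentions_armpl(config_text):
    """Return True if the configuration text refers to an armpl library."""
    return "armpl" in config_text.lower()


def captured_numpy_config():
    """Capture the text that numpy.show_config() prints to stdout."""
    import numpy  # imported lazily so mentions_armpl works without NumPy

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()
```

    In the virtual environment where NumPy was installed, print(mentions_armpl(captured_numpy_config())) is expected to print True if the build picked up Arm Performance Libraries.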