Building NumPy with Arm Compiler


  1. Check out the latest NumPy sources. Run:

    git clone numpy

    and change into the numpy directory:

    cd numpy
  2. Add 'arm' to the list of known C compilers.

    To allow the use of the flag compiler=arm when building NumPy, edit numpy/distutils/ and add an entry for the Arm Compiler class in the function definition CCompiler_cxx_compiler:

    compiler_class['pathcc'] = ('pathccompiler', 'PathScaleCCompiler',
                                "PathScale Compiler for SiCortex-based applications")
    compiler_class['arm'] = ('armccompiler', 'ArmCCompiler',
                               "Arm C Compiler")
    ccompiler._default_compilers += (('linux.*', 'intel'),
                                     ('linux.*', 'intele'),
                                     ('linux.*', 'intelem'),
                                     ('linux.*', 'pathcc'),
                                     ('nt', 'intelw'),
                                     ('nt', 'intelemw'))
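The snippet above registers the 'arm' class but does not add it to the default table, which is presumably why the build commands in the last step pass --compiler=arm explicitly. The sketch below uses only the standard library to mimic how a (pattern, compiler) table such as _default_compilers is searched. The three leading entries are an assumption about the stock distutils defaults, and pick_default_compiler is a hypothetical helper, not a NumPy or distutils function.

```python
import re

# Assumed stock distutils defaults (first three entries), followed by the
# entries appended in the snippet above. Note that 'arm' is not in the table.
DEFAULT_COMPILERS = (
    ("cygwin.*", "unix"),
    ("posix", "unix"),
    ("nt", "msvc"),
    ("linux.*", "intel"),
    ("linux.*", "intele"),
    ("linux.*", "intelem"),
    ("linux.*", "pathcc"),
    ("nt", "intelw"),
    ("nt", "intelemw"),
)


def pick_default_compiler(osname, platform, table=DEFAULT_COMPILERS):
    """Return the first compiler whose pattern matches platform or osname."""
    for pattern, compiler in table:
        if re.match(pattern, platform) or re.match(pattern, osname):
            return compiler
    return "unix"  # fallback when no pattern matches


if __name__ == "__main__":
    # Typical values of os.name and sys.platform on a Linux host.
    print(pick_default_compiler("posix", "linux"))
```

Because the first matching entry wins, appended entries only take effect when nothing earlier in the table matches, so selecting the Arm class by name is the more predictable route.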
  3. To configure for the Arm C/C++ Compiler, add a new file, containing:

    from __future__ import division, absolute_import, print_function
    from distutils.unixccompiler import UnixCCompiler
    class ArmCCompiler(UnixCCompiler):
        """Arm compiler."""
        compiler_type = 'arm'
        cc_exe = 'armclang'
        cxx_exe = 'armclang++'
        def __init__ (self, verbose=0, dry_run=0, force=0):
            UnixCCompiler.__init__ (self, verbose, dry_run, force)
            cc_compiler = self.cc_exe
            cxx_compiler = self.cxx_exe
            self.set_executables(compiler=cc_compiler + ' -mcpu=native -O3',
                                 compiler_so=cc_compiler + ' -mcpu=native -O3',
                                 compiler_cxx=cxx_compiler + ' -mcpu=native -O3',
                                 linker_exe=cc_compiler + ' -lamath',
                                 linker_so=cc_compiler + ' -lamath -shared')
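To make the effect of the set_executables call easier to see, the following sketch rebuilds the same command strings and splits them into argument lists. It is an illustration only: build_executables and as_argv are hypothetical helpers, and shlex.split stands in for whatever tokenization distutils applies internally.

```python
import shlex

CC_EXE = "armclang"
CXX_EXE = "armclang++"
COMMON_FLAGS = "-mcpu=native -O3"


def build_executables(cc=CC_EXE, cxx=CXX_EXE, flags=COMMON_FLAGS):
    """Return the command strings that ArmCCompiler hands to set_executables."""
    return {
        "compiler": f"{cc} {flags}",
        "compiler_so": f"{cc} {flags}",
        "compiler_cxx": f"{cxx} {flags}",
        "linker_exe": f"{cc} -lamath",
        "linker_so": f"{cc} -lamath -shared",
    }


def as_argv(commands):
    """Split each command string into an argument list (shlex as a stand-in)."""
    return {name: shlex.split(cmd) for name, cmd in commands.items()}


if __name__ == "__main__":
    for name, argv in as_argv(build_executables()).items():
        print(f"{name:13s} {argv}")
```

Keeping the optimization flags in one place, as COMMON_FLAGS does here, makes it easier to swap -mcpu=native for an explicit target when building on one machine and running on another.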
  4. Provide system_info classes for Arm Performance Libraries, covering BLAS, LAPACK, and FFTW3. Edit the file numpy/distutils/ as follows:

    1. Add entries to the get_info function definition.

      Note: By default, atlas is the first entry.

        cl = {'armpl': armpl_info,
                'blas_armpl': blas_armpl_info,
                'lapack_armpl': lapack_armpl_info,
                'fftw3_armpl' : fftw3_armpl_info,
                'atlas': atlas_info,  # use lapack_opt or blas_opt instead
    2. Add a class definition for fftw3_armpl_info:

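      class fftw3_armpl_info(fftw_info):
          section = 'fftw3'
          dir_env_var = 'ARMPL_DIR'
          notfounderror = FFTWNotFoundError
          # 'libs' and 'includes' are assumed here: the library name matches
          # _lib_armpl in armpl_info, and fftw3.h is the standard FFTW3 header.
          ver_info = [{'name': 'fftw3',
                       'libs': ['armpl_lp64_mp'],
                       'includes': ['fftw3.h'],
                       'macros': [('SCIPY_FFTW3_H', None)]},
                      ]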

      where ARMPL_DIR is an environment variable providing the path to the location of Arm Performance Libraries. If used, this is set by the Arm Performance Libraries module file, for example for a ThunderX2 platform:

      module load ThunderX2CN99/RHEL/7/arm-hpc-compiler-19.0/armpl/19.0

      or for a generic Armv8-A target:

      module load Generic-AArch64/RHEL/7/arm-hpc-compiler-19.0/armpl/19.0
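    3. Make the optimized LAPACK and BLAS lookups check for Arm Performance Libraries first. In lapack_opt_info, before the call to get_info('lapack_mkl'), add a clause that uses 'lapack_armpl' when it is found. In blas_opt_info, add the equivalent clause for 'blas_armpl' at the start of calc_info:

      class lapack_opt_info(system_info):
          notfounderror = LapackNotFoundError
          def calc_info(self):
              lapack_armpl_info = get_info('lapack_armpl')
              if lapack_armpl_info:
                  # Use Arm Performance Libraries and skip the other lookups.
                  self.set_info(**lapack_armpl_info)
                  return
              # ... existing lookups, starting with get_info('lapack_mkl'), follow
      class blas_opt_info(system_info):
          notfounderror = BlasNotFoundError
          def calc_info(self):
              blas_armpl_info = get_info('blas_armpl')
              if blas_armpl_info:
                  # Use Arm Performance Libraries and skip the other lookups.
                  self.set_info(**blas_armpl_info)
                  return
              # ... existing lookups follow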
    4. Add the armpl_info class after the mkl_info class:

      class armpl_info(system_info):
          section = 'armpl'
          dir_env_var = 'ARMPL_DIR'
          _lib_armpl = ['armpl_lp64_mp']
          def calc_info(self):
              lib_dirs = self.get_lib_dirs()
              incl_dirs = self.get_include_dirs()
              armpl_libs = self.get_libs('armpl_libs', self._lib_armpl)
              info = self.check_libs2(lib_dirs, armpl_libs)
              if info is None:
                  # Library not found: leave the info empty.
                  return
              dict_append(info,
                          define_macros=[('SCIPY_MKL_H', None),
                                         ('HAVE_CBLAS', None)],
                          include_dirs=incl_dirs)
              self.set_info(**info)
      class lapack_armpl_info(armpl_info):
          pass
      class blas_armpl_info(armpl_info):
          pass
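Since every class in this step locates the library through ARMPL_DIR, it may be worth checking that variable before building. The sketch below is a standalone check, not part of NumPy: find_armpl is a hypothetical helper, and the lib/ and include/ subdirectories are an assumed installation layout.

```python
import os
from pathlib import Path


def find_armpl(env=None, lib_name="armpl_lp64_mp"):
    """Return (library_files, include_dir) under $ARMPL_DIR, or None if unset."""
    env = os.environ if env is None else env
    root = env.get("ARMPL_DIR")
    if not root:
        return None
    root = Path(root)
    # Assumed layout: libraries under lib/, headers under include/.
    libs = sorted((root / "lib").glob(f"lib{lib_name}.*"))
    include_dir = root / "include"
    if not include_dir.is_dir():
        include_dir = None
    return libs, include_dir


if __name__ == "__main__":
    result = find_armpl()
    if result is None:
        print("ARMPL_DIR is not set; load the Arm Performance Libraries module first.")
    else:
        libs, include_dir = result
        print("libraries:", [p.name for p in libs])
        print("include dir:", include_dir)
```

An empty library list with ARMPL_DIR set usually points to a different library variant or layout than the one assumed here, in which case the lib_name argument or the paths need adjusting.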
  5. Configure NumPy for the Arm Fortran Compiler:

    1. Add 'arm' to the list of known Fortran compilers. In numpy/distutils/fcompiler/__, add an 'arm' entry to the platform mappings in the function definition of wrap_unlinkable_objects:

      _default_compilers = (
          # sys.platform mappings
          ('linux.*', ('arm', 'gnu95', 'intel', 'lahey', 'pg', 'absoft', 'nag', 'vast', 'compaq',
                       'intele', 'intelem', 'gnu', 'g95', 'pathf95', 'nagfor')),
    2. Add a configuration file for the Arm Fortran Compiler. In numpy/distutils/fcompiler/<file>, create a new file with the following contents:

      from __future__ import division, absolute_import, print_function
      import sys
      from numpy.distutils.fcompiler import FCompiler, dummy_fortran_file
      from sys import platform
      from os.path import join, dirname, normpath
      compilers = ['ArmFlangCompiler']
      import functools
      class ArmFlangCompiler(FCompiler):
          compiler_type = 'arm'
          description = 'Arm Compiler'
          version_pattern = r'\s*Arm.*version (?P<version>[\d.-]+).*'

          ar_exe = 'lib.exe'
          possible_executables = ['armflang']

          executables = {
              'version_cmd': ["<F90>", "--version"],
              'compiler_f77': ["armflang"],
              'compiler_fix': ["armflang", "-ffixed-form"],
              'compiler_f90': ["armflang"],
              'linker_so': ["armflang", "-fPIC", "-shared"],
              'archiver': ["ar", "-cr"],
              'ranlib':  None
          }

          pic_flags = ["-fPIC", "-DPIC"]
          c_compiler = 'arm'
          module_dir_switch = '-module '  # Don't remove ending space!

          def get_libraries(self):
              opt = FCompiler.get_libraries(self)
              opt.extend(['flang', 'flangrti', 'ompstub'])
              return opt
          def get_library_dirs(self):
              """List of compiler library directories."""
              opt = FCompiler.get_library_dirs(self)
              flang_dir = dirname(self.executables['compiler_f77'][0])
              opt.append(normpath(join(flang_dir, '..', 'lib')))
              return opt
          def get_flags(self):
              return []
          def get_flags_free(self):
              return []
          def get_flags_debug(self):
              return ['-g']
          def get_flags_opt(self):
              return ['-O3']
          def get_flags_arch(self):
              return []
          def runtime_library_dir_option(self, dir):
              raise NotImplementedError
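      if __name__ == '__main__':
          from distutils import log
          log.set_verbosity(2)
          from numpy.distutils import customized_fcompiler
          # Print the detected version; 'arm' matches compiler_type above.
          print(customized_fcompiler(compiler='arm').get_version())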

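The version_pattern in ArmFlangCompiler is what extracts the compiler version from the output of the version command. The sketch below applies the same regular expression with re.match to show what it captures. SAMPLE_BANNER is a made-up example line, not verified armflang output, and parse_version is a hypothetical helper.

```python
import re

# Same pattern as version_pattern in the ArmFlangCompiler class above.
VERSION_PATTERN = r'\s*Arm.*version (?P<version>[\d.-]+).*'

# Hypothetical banner used only for illustration.
SAMPLE_BANNER = "Arm C/C++/Fortran Compiler version 19.0 (build number 69)"


def parse_version(banner):
    """Return the captured version string, or None if the banner does not match."""
    match = re.match(VERSION_PATTERN, banner)
    return match.group("version") if match else None


if __name__ == "__main__":
    print(parse_version(SAMPLE_BANNER))
```

If the real banner of an installed compiler does not match, the build is likely to report that the Fortran compiler could not be identified, so testing the pattern against the actual output of armflang --version is a quick sanity check.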
  6. Build NumPy:

    Note: Arm suggests first building in a virtual environment before doing a global installation.

    1. (Optional) Create a virtual environment. If virtualenv is not already installed, install it with pip3 install virtualenv. Then initialize and activate the virtual environment, and install Cython:

      python3 -m venv ~/venv_numpy_armpl
      source ~/venv_numpy_armpl/bin/activate
      pip3 install Cython
    2. Build and install NumPy:

      CC=armclang python3 config --compiler=arm build_clib --compiler=arm build_ext
      CC=armclang python3 config --compiler=arm build_clib --compiler=arm build_ext --compiler=arm install

      You can confirm that NumPy has been built with Arm Performance Libraries by calling numpy.show_config().

    3. Test the installation. Run the NumPy test suite (this may require the installation of PyTest using pip3 install pytest):

      python3 -c 'import numpy; numpy.test()'

      Note: Quad precision is not supported; expect this to cause a test failure.
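
Before running the build commands in this step, it can save time to confirm that the compiler drivers named in the earlier steps are actually on PATH. The following is a minimal preflight sketch using only the standard library; locate_tools and missing_tools are hypothetical helpers, and the tool list simply repeats the executable names used in the compiler classes above.

```python
import shutil

# Executable names taken from the ArmCCompiler and ArmFlangCompiler classes.
REQUIRED_TOOLS = ("armclang", "armclang++", "armflang")


def locate_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its full path, or None when it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in locate_tools(names).items() if path is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All Arm compiler drivers found.")
```

A missing tool usually means the compiler module has not been loaded in the current shell.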
