NumPy

Building NumPy with Arm Compiler



How to build NumPy with Arm Compiler.

NumPy is a package for scientific computing with Python. It provides a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; useful linear algebra, Fourier transform, and random number capabilities; and much more.

For more information, see the NumPy website.

The following components are used for this build:

Component                    Version
NumPy                        1.15
Python                       3.6
Arm Compiler                 19.0
Arm Performance Libraries    19.0
Operating system             RHEL 7.5
Hardware                     Cavium ThunderX2

Before you begin

Procedure

  1. Check out the latest NumPy sources. Run:

    git clone https://github.com/numpy/numpy.git numpy

    and change into the numpy directory:

    cd numpy
  2. Add 'arm' to the list of known C compilers.

    To allow the use of the flag --compiler=arm when building NumPy, edit numpy/distutils/ccompiler.py and add an entry for the Arm Compiler class to the compiler_class registry that follows the definition of the function CCompiler_cxx_compiler, next to the existing 'pathcc' entry:

    compiler_class['pathcc'] = ('pathccompiler', 'PathScaleCCompiler',
                                "PathScale Compiler for SiCortex-based applications")
    compiler_class['arm'] = ('armccompiler', 'ArmCCompiler',
                             "Arm C Compiler")
    ccompiler._default_compilers += (('linux.*', 'intel'),
                                     ('linux.*', 'intele'),
                                     ('linux.*', 'intelem'),
                                     ('linux.*', 'pathcc'),
                                     ('nt', 'intelw'),
                                     ('nt', 'intelemw'))
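
The tuple registered under 'arm' follows the same layout as the other entries: module name, class name, description. As a rough illustration of how such an entry can be turned into something to import, the sketch below mimics the lookup with a plain dictionary. The helper resolve_compiler and the way it prepends the numpy.distutils package name are simplifications written for this example, not NumPy's actual code.

```python
# Toy model of the compiler registry edited in step 2.
# `resolve_compiler` is a hypothetical helper for illustration only.

compiler_class = {
    "pathcc": ("pathccompiler", "PathScaleCCompiler",
               "PathScale Compiler for SiCortex-based applications"),
    "arm": ("armccompiler", "ArmCCompiler", "Arm C Compiler"),
}


def resolve_compiler(name, package="numpy.distutils"):
    """Return (dotted module path, class name) for a registered compiler."""
    try:
        module_name, class_name, _description = compiler_class[name]
    except KeyError:
        raise ValueError("unknown compiler type: %r" % name)
    return "%s.%s" % (package, module_name), class_name


if __name__ == "__main__":
    print(resolve_compiler("arm"))
```

In this model, the name passed to --compiler selects the module and class, which is why the module name in the entry needs to match the file name used in the next step.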
  3. To configure for the Arm C/C++ Compiler, add a new file, numpy/distutils/armccompiler.py, containing:

    from __future__ import division, absolute_import, print_function
     
    from distutils.unixccompiler import UnixCCompiler
     
    class ArmCCompiler(UnixCCompiler):
     
        """
        Arm compiler.
        """
     
        compiler_type = 'arm'
        cc_exe = 'armclang'
        cxx_exe = 'armclang++'
     
        def __init__(self, verbose=0, dry_run=0, force=0):
            UnixCCompiler.__init__(self, verbose, dry_run, force)
            cc_compiler = self.cc_exe
            cxx_compiler = self.cxx_exe
            self.set_executables(compiler=cc_compiler + ' -mcpu=native -O3',
                                 compiler_so=cc_compiler + ' -mcpu=native -O3',
                                 compiler_cxx=cxx_compiler + ' -mcpu=native -O3',
                                 linker_exe=cc_compiler + ' -lamath',
                                 linker_so=cc_compiler + ' -lamath -shared')
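
Each string passed to set_executables() describes one command line. To make the resulting commands easier to inspect, the sketch below splits the same strings into argument lists. It uses shlex.split from the standard library as a stand-in for the splitting that distutils performs, which is an assumption of this example; the names CC, CXX, OPT_FLAGS and as_argv are also illustrative.

```python
import shlex

CC = "armclang"
CXX = "armclang++"
OPT_FLAGS = "-mcpu=native -O3"

# Same command strings as the set_executables() call in armccompiler.py.
executables = {
    "compiler": CC + " " + OPT_FLAGS,
    "compiler_so": CC + " " + OPT_FLAGS,
    "compiler_cxx": CXX + " " + OPT_FLAGS,
    "linker_exe": CC + " -lamath",
    "linker_so": CC + " -lamath -shared",
}


def as_argv(table):
    """Split each command string into an argument list."""
    return {key: shlex.split(value) for key, value in table.items()}


if __name__ == "__main__":
    for key, argv in as_argv(executables).items():
        print("%-12s %s" % (key, argv))
```

The compile commands carry the optimization flags, while both link commands add -lamath, and only the shared-object link adds -shared.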
  4. Provide system_info classes for Arm Performance Libraries, covering BLAS, LAPACK, and FFTW3. Edit the file numpy/distutils/system_info.py as follows:

    1. Add entries to the get_info function definition.

      Note: By default, atlas is the first entry.

        cl = {'armpl': armpl_info,
              'blas_armpl': blas_armpl_info,
              'lapack_armpl': lapack_armpl_info,
              'fftw3_armpl': fftw3_armpl_info,
              'atlas': atlas_info,  # use lapack_opt or blas_opt instead
              # ... the existing entries follow unchanged
    2. Add a class definition for fftw3_armpl_info:

      class fftw3_armpl_info(fftw_info):
          section = 'fftw3'
          dir_env_var = 'ARMPL_DIR'
          notfounderror = FFTWNotFoundError
          ver_info = [{'name': 'fftw3',
                       'libs': ['armpl_lp64_mp'],
                       'includes': ['fftw3.h'],
                       'macros': [('SCIPY_FFTW3_H', None)]},
                      ]

      where ARMPL_DIR is an environment variable that gives the installation path of Arm Performance Libraries. If you use environment modules, the Arm Performance Libraries module file sets this variable. For example, for a ThunderX2 platform:

      module load ThunderX2CN99/RHEL/7/arm-hpc-compiler-19.0/armpl/19.0

      or for a generic Armv8-A target:

      module load Generic-AArch64/RHEL/7/arm-hpc-compiler-19.0/armpl/19.0
      
    3. Make lapack_opt_info and blas_opt_info prefer Arm Performance Libraries. In each calc_info method, before the existing calls to get_info('lapack_mkl') and get_info('blas_mkl'), add a clause that detects Arm Performance Libraries and, if found, sets lapack_opt_info from 'lapack_armpl' and blas_opt_info from 'blas_armpl':

      class lapack_opt_info(system_info):
      
          notfounderror = LapackNotFoundError
      
          def calc_info(self):
      
              lapack_armpl_info = get_info('lapack_armpl')
              if lapack_armpl_info:
                  self.set_info(**lapack_armpl_info)
                  return
              # ... the existing lookups (lapack_mkl and so on) follow unchanged

      class blas_opt_info(system_info):
      
          notfounderror = BlasNotFoundError
      
          def calc_info(self):
      
              blas_armpl_info = get_info('blas_armpl')
              if blas_armpl_info:
                  self.set_info(**blas_armpl_info)
                  return
              # ... the existing lookups (blas_mkl and so on) follow unchanged
    4. Add the armpl_info class after the mkl_info class:

      class armpl_info(system_info):
          section = 'armpl'
          dir_env_var = 'ARMPL_DIR'
          _lib_armpl = ['armpl_lp64_mp']
      
          def calc_info(self):
              lib_dirs = self.get_lib_dirs()
              incl_dirs = self.get_include_dirs()
              armpl_libs = self.get_libs('armpl_libs', self._lib_armpl)
              info = self.check_libs2(lib_dirs, armpl_libs)
              if info is None:
                  return
              dict_append(info,
                          define_macros=[('SCIPY_MKL_H', None),
                                         ('HAVE_CBLAS', None)],
                          include_dirs=incl_dirs)
              self.set_info(**info)
      
      class lapack_armpl_info(armpl_info):
          pass
      
      class blas_armpl_info(armpl_info):
          pass
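
Because the classes above rely on ARMPL_DIR, it can help to check that the variable points at a usable installation before building. The following is a minimal sketch of such a check. It assumes that the libraries live in a lib subdirectory of ARMPL_DIR, and find_armpl is a hypothetical helper written for this example, not part of NumPy or Arm Performance Libraries.

```python
import glob
import os


def find_armpl(env=None, lib_name="armpl_lp64_mp"):
    """Return library files matching lib<lib_name>.* under $ARMPL_DIR/lib.

    Assumes the libraries are in a 'lib' subdirectory of ARMPL_DIR;
    adjust the path if your installation is laid out differently.
    """
    env = os.environ if env is None else env
    root = env.get("ARMPL_DIR")
    if not root:
        return []
    pattern = os.path.join(root, "lib", "lib%s.*" % lib_name)
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    hits = find_armpl()
    if hits:
        print("Found:", ", ".join(hits))
    else:
        print("ARMPL_DIR is unset or no armpl_lp64_mp library was found")
```

An empty result usually means that the module file has not been loaded in the current shell.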
  5. Configure NumPy for the Arm Fortran Compiler:

    1. Add 'arm' to the list of known Fortran compilers. In numpy/distutils/fcompiler/__init__.py, add an 'arm' entry to the platform mappings in _default_compilers, which is defined after the function wrap_unlinkable_objects:

      _default_compilers = (
          # sys.platform mappings
          ('linux.*', ('arm', 'gnu95', 'intel', 'lahey', 'pg', 'absoft', 'nag', 'vast', 'compaq',
                       'intele', 'intelem', 'gnu', 'g95', 'pathf95', 'nagfor')),
    2. Add an arm.py configuration file for the Arm Fortran Compiler. In the numpy/distutils/fcompiler/ directory, create a new arm.py file with the following contents:

      from __future__ import division, absolute_import, print_function
       
      import sys
       
      from numpy.distutils.fcompiler import FCompiler, dummy_fortran_file
      from sys import platform
      from os.path import join, dirname, normpath
       
      compilers = ['ArmFlangCompiler']
       
      import functools
       
      class ArmFlangCompiler(FCompiler):
          compiler_type = 'arm'
          description = 'Arm Compiler'
          version_pattern = r'\s*Arm.*version (?P<version>[\d.-]+).*'

          ar_exe = 'lib.exe'
          possible_executables = ['armflang']

          executables = {
              'version_cmd': ["", "--version"],
              'compiler_f77': ["armflang"],
              'compiler_fix': ["armflang", "-ffixed-form"],
              'compiler_f90': ["armflang"],
              'linker_so': ["armflang", "-fPIC", "-shared"],
              'archiver': ["ar", "-cr"],
              'ranlib':  None
          }

          pic_flags = ["-fPIC", "-DPIC"]
          c_compiler = 'arm'
          module_dir_switch = '-module '  # Don't remove ending space!

          def get_libraries(self):
              opt = FCompiler.get_libraries(self)
              opt.extend(['flang', 'flangrti', 'ompstub'])
              return opt
       
          @functools.lru_cache(maxsize=128)
          def get_library_dirs(self):
              """List of compiler library directories."""
              opt = FCompiler.get_library_dirs(self)
              flang_dir = dirname(self.executables['compiler_f77'][0])
              opt.append(normpath(join(flang_dir, '..', 'lib')))
       
              return opt
       
          def get_flags(self):
              return []
       
          def get_flags_free(self):
              return []
       
          def get_flags_debug(self):
              return ['-g']
       
          def get_flags_opt(self):
              return ['-O3']
       
          def get_flags_arch(self):
              return []
       
          def runtime_library_dir_option(self, dir):
              raise NotImplementedError
       
       
      if __name__ == '__main__':
          from distutils import log
          log.set_verbosity(2)
          from numpy.distutils import customized_fcompiler
          print(customized_fcompiler(compiler='arm').get_version())
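
Two pieces of arm.py can be tried out without a compiler installed: the version_pattern regular expression and the path arithmetic in get_library_dirs(). The sketch below reproduces both with the standard library only. The banner text and the installation path are illustrative values rather than real armflang output, and parse_version and sibling_lib_dir are hypothetical helper names; the use of re.match is an assumption about how the pattern is applied.

```python
import posixpath
import re

# Pattern copied from ArmFlangCompiler.version_pattern.
VERSION_PATTERN = r'\s*Arm.*version (?P<version>[\d.-]+).*'


def parse_version(banner):
    """Return the version captured by VERSION_PATTERN, or None."""
    match = re.match(VERSION_PATTERN, banner)
    return match.group('version') if match else None


def sibling_lib_dir(compiler_path):
    """Mirror get_library_dirs(): <dir of compiler>/../lib, normalized."""
    bin_dir = posixpath.dirname(compiler_path)
    return posixpath.normpath(posixpath.join(bin_dir, '..', 'lib'))


if __name__ == "__main__":
    # Illustrative banner text, not captured from a real armflang run.
    print(parse_version("Arm Fortran Compiler version 19.0 (example build)"))
    # Hypothetical installation path.
    print(sibling_lib_dir("/opt/arm/example/bin/armflang"))
```

In this sketch a bare executable name such as armflang yields the relative path ../lib, so the derived directory is only meaningful when the compiler is given as a full path.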

  6. Build NumPy:

    Note: Arm suggests first building in a virtual environment before doing a global installation.

    1. (Optional) Create a virtual environment. If virtualenv is not already installed, install it with pip3 install virtualenv. Then initialize and activate the virtual environment, and install Cython:

      python3 -m venv ~/venv_numpy_armpl
      source ~/venv_numpy_armpl/bin/activate
      pip3 install Cython
    2. Build and install NumPy:

      CC=armclang python3 setup.py config --compiler=arm build_clib --compiler=arm build_ext
      CC=armclang python3 setup.py config --compiler=arm build_clib --compiler=arm build_ext --compiler=arm install

      You can confirm that NumPy has been built with Arm Performance Libraries by calling numpy.show_config().

    3. Test the installation. Run the NumPy test suite (this may require installing pytest with pip3 install pytest):

      python3 -c 'import numpy; numpy.test()'

      Note: Quad precision is not supported, which results in a test failure.
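
The numpy.show_config() check mentioned above can also be scripted. The following is a minimal sketch that captures the printed configuration and looks for the string armpl. It assumes that show_config() writes to standard output and that the library name appears in that output; mentions_armpl and captured_show_config are hypothetical helper names.

```python
import contextlib
import io


def mentions_armpl(config_text):
    """Return True if the configuration text references an armpl library."""
    return "armpl" in config_text.lower()


def captured_show_config():
    """Capture what numpy.show_config() prints, or None if NumPy is missing."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    text = captured_show_config()
    if text is None:
        print("NumPy is not importable in this environment")
    else:
        print("Arm Performance Libraries referenced:", mentions_armpl(text))
```

Run the script from outside the numpy source directory so that the installed package is imported rather than the source tree.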