NumPy

Building NumPy with Arm Compiler



How to build NumPy with Arm Compiler.

NumPy is a package for scientific computing with Python. It provides a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; useful linear algebra, Fourier transform, and random number capabilities; and much more.

For more information, see the NumPy website.

The following components are used for this build:

Component                  Version
NumPy                      1.17.x
Python                     3.6
Arm Compiler               19.2
Arm Performance Libraries  19.2
Operating system           RHEL 7.5
Hardware                   Cavium ThunderX2

Before you begin

Procedure

  1. Clone the NumPy repository. Run:

    git clone https://github.com/numpy/numpy.git numpy

    and change into the numpy directory:

    cd numpy
  2. Check out the 1.17 branch.

    git checkout remotes/origin/maintenance/1.17.x -b v1.17
  3. Add 'arm' to the list of known C compilers.

    To allow the use of the --compiler=arm flag during the NumPy build, edit numpy/distutils/ccompiler.py and add an entry for the Arm Compiler class to the compiler_class entries that follow the definition of the CCompiler_cxx_compiler function:

    compiler_class['pathcc'] = ('pathccompiler', 'PathScaleCCompiler',
                                "PathScale Compiler for SiCortex-based applications")
    compiler_class['arm'] = ('armccompiler', 'ArmCCompiler',
                               "Arm C Compiler")
    ccompiler._default_compilers += (('linux.*', 'intel'),
                                     ('linux.*', 'intele'),
                                     ('linux.*', 'intelem'),
                                     ('linux.*', 'pathcc'),
                                     ('nt', 'intelw'),
                                     ('nt', 'intelemw'))
  4. To configure for the Arm C/C++ Compiler, add a new file, numpy/distutils/armccompiler.py, containing:

    from __future__ import division, absolute_import, print_function
     
    from distutils.unixccompiler import UnixCCompiler
     
    class ArmCCompiler(UnixCCompiler):
     
        """
        Arm compiler.
        """
     
        compiler_type = 'arm'
        cc_exe = 'armclang'
        cxx_exe = 'armclang++'
     
        def __init__ (self, verbose=0, dry_run=0, force=0):
            UnixCCompiler.__init__ (self, verbose, dry_run, force)
            cc_compiler = self.cc_exe
            cxx_compiler = self.cxx_exe
            self.set_executables(compiler=cc_compiler + ' -mcpu=native -O3',
                                 compiler_so=cc_compiler + ' -mcpu=native -O3',
                                 compiler_cxx=cxx_compiler + ' -mcpu=native -O3',
                                 linker_exe=cc_compiler + ' -lamath',
                                 linker_so=cc_compiler + ' -lamath -shared')
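The compiler class above invokes armclang and armclang++ by name, so both must be on PATH when the build runs (as must armflang, which the Fortran configuration later in this procedure uses). The following is a minimal sketch, using only the Python standard library, of a pre-build check. The helper name find_arm_tools is hypothetical and is not part of NumPy or Arm Compiler.

```python
import shutil

# Executable names taken from the compiler classes in this procedure.
ARM_TOOLS = ("armclang", "armclang++", "armflang")


def find_arm_tools(names=ARM_TOOLS):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_arm_tools().items():
        print(f"{name}: {path or 'NOT FOUND (is the compiler module loaded?)'}")
```

If any entry reports NOT FOUND, loading the Arm Compiler module file before building is the likely fix.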
  5. Provide system_info classes for Arm Performance Libraries, covering BLAS, LAPACK, and FFTW3. Edit the file numpy/distutils/system_info.py as follows:

    1. Add entries to the get_info function definition.

      Note: By default, atlas is the first entry.

        cl = {'armpl': armpl_info,
                'blas_armpl': blas_armpl_info,
                'lapack_armpl': lapack_armpl_info,
                'fftw3_armpl' : fftw3_armpl_info,
                'atlas': atlas_info,  # use lapack_opt or blas_opt instead
    2. Add a class definition for fftw3_armpl_info:

      class fftw3_armpl_info(fftw_info):
          section = 'fftw3'
          dir_env_var = 'ARMPL_DIR'
          notfounderror = FFTWNotFoundError
          ver_info = [{'name':'fftw3',
                          'libs':['armpl_lp64_mp'],
                          'includes':['fftw3.h'],
                          'macros':[('SCIPY_FFTW3_H', None)]},
                        ]

      where ARMPL_DIR is an environment variable that holds the installation path of Arm Performance Libraries. Loading the Arm Performance Libraries module file sets this variable, for example, on a ThunderX2 platform:

      module load ThunderX2CN99/RHEL/7/arm-hpc-compiler-19.2/armpl/19.2

      or for a generic Armv8-A target:

      module load Generic-AArch64/RHEL/7/arm-hpc-compiler-19.2/armpl/19.2
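Because the system_info classes locate the library through ARMPL_DIR, it can help to confirm that the variable is set and points at a usable installation before building. The sketch below assumes that the library and header live in lib and include subdirectories of ARMPL_DIR and that the shared library is named libarmpl_lp64_mp.so. Both are assumptions of this example, as is the helper name missing_armpl_files.

```python
import os


def missing_armpl_files(root, lib="armpl_lp64_mp", header="fftw3.h"):
    """Return the expected files that are absent under root.

    Assumes a <root>/lib and <root>/include layout; this layout is an
    assumption of the sketch, not something the NumPy build enforces.
    """
    expected = [
        os.path.join(root, "lib", "lib" + lib + ".so"),
        os.path.join(root, "include", header),
    ]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    root = os.environ.get("ARMPL_DIR")
    if root is None:
        print("ARMPL_DIR is not set; load the Arm Performance Libraries module first")
    else:
        missing = missing_armpl_files(root)
        print("missing:", missing if missing else "nothing")
```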
      
  6. To make the LAPACK and BLAS lookups find Arm Performance Libraries (ArmPL), make the following additions to numpy/distutils/system_info.py:

    1. Before the call to get_info('lapack_mkl'), in the class definition of lapack_opt_info, add armpl to lapack_order:

      # Default order of LAPACK checks
      lapack_order = ['armpl','mkl', 'openblas', 'flame', 'atlas', 'accelerate', 'lapack']
    2. Add _calc_info_armpl:

      def _calc_info_armpl(self):
          info = get_info('lapack_armpl')
          if info:
              self.set_info(**info)
              return True
          return False
              
    3. In the class definition of blas_opt_info, add armpl to blas_order and add a matching _calc_info_armpl:

      class blas_opt_info(system_info):

          notfounderror = BlasNotFoundError
          # Default order of BLAS checks
          blas_order = ['armpl', 'mkl', 'blis', 'openblas', 'atlas', 'accelerate', 'blas']

          def _calc_info_armpl(self):
              info = get_info('blas_armpl')
              if info:
                  self.set_info(**info)
                  return True
              return False
    4. Add the armpl_info class after the mkl_info class:
      class armpl_info(system_info):
          section = 'armpl'
          dir_env_var = 'ARMPL_DIR'
          _lib_armpl = ['armpl_lp64_mp']
      
          def calc_info(self):
              lib_dirs = self.get_lib_dirs()
              incl_dirs = self.get_include_dirs()
              armpl_libs = self.get_libs('armpl_libs', self._lib_armpl)
              info = self.check_libs2(lib_dirs, armpl_libs)
              if info is None:
                  return
              dict_append(info,
                          define_macros=[('SCIPY_MKL_H', None),
                                         ('HAVE_CBLAS', None)],
                          include_dirs=incl_dirs)
              self.set_info(**info)
      
      class lapack_armpl_info(armpl_info):
          pass
      
      class blas_armpl_info(armpl_info):
          pass
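The reason armpl is placed first in lapack_order and blas_order is that, as the _calc_info_<name> naming suggests, the *_opt_info classes try each back end in list order and stop at the first one that is found. The following is a simplified, self-contained model of that dispatch. It is not NumPy's code; the class OptInfoSketch and its available argument (a stand-in for what get_info would discover on the system) are hypothetical.

```python
class OptInfoSketch:
    """Simplified model of an *_opt_info class: try back ends in order."""

    # Same idea as lapack_order / blas_order, shortened for the example.
    order = ["armpl", "mkl", "openblas"]

    def __init__(self, available):
        # Stand-in for the libraries that get_info() would find on this system.
        self.available = set(available)
        self.chosen = None

    def _probe(self, name):
        """Record and report whether the named back end is available."""
        if name in self.available:
            self.chosen = name
            return True
        return False

    def _calc_info_armpl(self):
        return self._probe("armpl")

    def _calc_info_mkl(self):
        return self._probe("mkl")

    def _calc_info_openblas(self):
        return self._probe("openblas")

    def calc_info(self):
        """Return the first back end in `order` whose probe succeeds, else None."""
        for name in self.order:
            if getattr(self, "_calc_info_" + name)():
                return self.chosen
        return None


if __name__ == "__main__":
    # With both installed, the earlier entry in the order list wins.
    print(OptInfoSketch({"openblas", "armpl"}).calc_info())
```

In this model, adding a name to the order list without a matching _calc_info_<name> method raises AttributeError, which is why the procedure adds both together.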
  7. Configure NumPy for the Arm Fortran Compiler:

    1. Add 'arm' to the list of known Fortran compilers. In numpy/distutils/fcompiler/__init__.py, add an 'arm' entry to the platform mappings in _default_compilers, which follows the definition of the wrap_unlinkable_objects function:

      _default_compilers = (
          # sys.platform mappings
          ('linux.*', ('arm', 'gnu95', 'intel', 'lahey', 'pg', 'absoft', 'nag', 'vast', 'compaq',
                       'intele', 'intelem', 'gnu', 'g95', 'pathf95', 'nagfor')),
    2. Add an arm.py configuration file for the Arm Fortran Compiler. In the numpy/distutils/fcompiler/ directory, create a new arm.py file with the following contents:

      from __future__ import division, absolute_import, print_function

      import sys

      from numpy.distutils.fcompiler import FCompiler, dummy_fortran_file
      from sys import platform
      from os.path import join, dirname, normpath

      compilers = ['ArmFlangCompiler']

      import functools

      class ArmFlangCompiler(FCompiler):
          compiler_type = 'arm'
          description = 'Arm Compiler'
          version_pattern = r'\s*Arm.*version (?P<version>[\d.-]+).*'

          ar_exe = 'lib.exe'
          possible_executables = ['armflang']

          executables = {
              'version_cmd': ["", "--version"],
              'compiler_f77': ["armflang"],
              'compiler_fix': ["armflang", "-ffixed-form"],
              'compiler_f90': ["armflang"],
              'linker_so': ["armflang", "-fPIC", "-shared"],
              'archiver': ["ar", "-cr"],
              'ranlib': None
          }

          pic_flags = ["-fPIC", "-DPIC"]
          c_compiler = 'arm'
          module_dir_switch = '-module ' # Don't remove ending space!

          def get_libraries(self):
              opt = FCompiler.get_libraries(self)
              opt.extend(['flang', 'flangrti', 'ompstub'])
              return opt

          @functools.lru_cache(maxsize=128)
          def get_library_dirs(self):
              """List of compiler library directories."""
              opt = FCompiler.get_library_dirs(self)
              flang_dir = dirname(self.executables['compiler_f77'][0])
              opt.append(normpath(join(flang_dir, '..', 'lib')))

              return opt

          def get_flags(self):
              return []

          def get_flags_free(self):
              return []

          def get_flags_debug(self):
              return ['-g']

          def get_flags_opt(self):
              return ['-O3']

          def get_flags_arch(self):
              return []

          def runtime_library_dir_option(self, dir):
              raise NotImplementedError

      if __name__ == '__main__':
          from distutils import log
          log.set_verbosity(2)
          from numpy.distutils import customized_fcompiler
          # Look the compiler up by its compiler_type ('arm'), not by the executable name.
          print(customized_fcompiler(compiler='arm').get_version())
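The version_pattern in the class above is what extracts the compiler version from the output of armflang --version. The sketch below applies the same pattern with re.match to show which part of a banner it captures. The sample banner is invented for illustration and real compiler output may differ; the helper name parse_arm_version is also hypothetical.

```python
import re

# Pattern copied from ArmFlangCompiler.version_pattern above.
VERSION_PATTERN = r'\s*Arm.*version (?P<version>[\d.-]+).*'


def parse_arm_version(text):
    """Return the captured version string, or None if the banner does not match."""
    match = re.match(VERSION_PATTERN, text)
    return match.group("version") if match else None


# Hypothetical banner, invented for this example; real output may differ.
sample = "Arm Compiler version 19.2 (hypothetical build)"
print(parse_arm_version(sample))               # prints 19.2
print(parse_arm_version("GNU Fortran 8.2.0"))  # prints None
```

A banner that does not start with "Arm" yields no match, so a different compiler is not mistaken for this one.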
  8. Build NumPy:

    Note: Arm suggests building in a virtual environment first, before doing a global installation.

     

    1. (Optional) Create a virtual environment. If required, install virtualenv using pip3 install virtualenv, then initialize the virtual environment and install Cython into it:

      python3 -m venv ~/venv_numpy_armpl
      source ~/venv_numpy_armpl/bin/activate
      pip3 install Cython
    2. Build and install NumPy:

      CC=armclang python3 setup.py config --compiler=arm build_clib --compiler=arm build_ext
      CC=armclang python3 setup.py config --compiler=arm build_clib --compiler=arm build_ext --compiler=arm install

      To confirm that NumPy has been built with Arm Performance Libraries, call numpy.show_config() and check that the armpl library is listed.

    3. Test the installation. Run the NumPy test suite (this may require installing pytest using pip3 install pytest):

      python3 -c 'import numpy; numpy.test()'

      Note: quad precision is not supported; this will result in a test failure.
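The numpy.show_config() check in the build step can also be scripted. The sketch below captures the printed configuration and searches it for the library name used throughout this procedure, armpl_lp64_mp. The helper names mentions_armpl and captured_show_config are hypothetical, and the exact layout of the show_config() output is not assumed; only a plain substring search is performed.

```python
import contextlib
import io


def mentions_armpl(config_text, library="armpl_lp64_mp"):
    """Return True if the configuration text names the ArmPL library."""
    return library in config_text


def captured_show_config():
    """Return the text printed by numpy.show_config() (requires the built NumPy)."""
    import numpy  # imported lazily so mentions_armpl works without NumPy installed

    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()
```

In the virtual environment where NumPy was installed, print(mentions_armpl(captured_show_config())) reports whether the library name appears in the build configuration.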