GROMACS is a versatile package for performing molecular dynamics, that is, simulating the Newtonian equations of motion for systems with hundreds to millions of particles.

For more information on GROMACS, visit the GROMACS webpage.

Version information

The following components are used in this build:

GROMACS Version 2018.1
Open MPI Version 3.0.2
Operating System Ubuntu 16.04
Hardware Cavium ThunderX2
Arm Compiler for HPC Version 18.2
Arm Performance Libraries Version 18.2

Recipes for other versions of the application are available in the GitLab Packages Wiki.


Download the GROMACS source files by cloning the repository:

git clone git://git.gromacs.org/gromacs.git gromacs

Set gromacs_ver to the tag of the release you want to build (for example, v2018.1), then change into the newly created gromacs directory and check out that release:

export gromacs_ver=v2018.1
cd gromacs
git checkout -b Release ${gromacs_ver}

Note: This checks out the release version into a local branch called Release, but the name can be user-defined.


Tool configuration

  1. Ensure that your paths are set up appropriately for Arm Compiler for HPC:

    export PATH=/path/to/ArmCompiler/bin:/path/to/MPI/install/bin:$PATH
    export LD_LIBRARY_PATH=/path/to/ArmCompiler/lib:/path/to/MPI/install/lib:$LD_LIBRARY_PATH

    replacing /path/to/ArmCompiler/ and /path/to/MPI/install/ with the appropriate paths for your installation directories.

  2. Ensure $ARMPL_DIR is set to your Arm Performance Libraries installation directory:

    export ARMPL_DIR=/path/to/ARMPL_install_dir/

    replacing /path/to/ARMPL_install_dir/ with the appropriate path for your installation directory.
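
Before building, it can help to confirm that the settings above took effect in the current shell. The following is a minimal sketch and not part of the original recipe: the helper name check_toolchain is hypothetical, and it assumes the compiler drivers are called armclang and armclang++ (the names passed to cmake later in this recipe) and that your MPI installation provides mpirun.

```shell
#!/bin/sh
# Hypothetical helper: report whether the tools and variables that the
# build steps rely on are visible in the current shell.
check_toolchain() {
    status=0
    for tool in armclang armclang++ mpirun; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool (check PATH)"
            status=1
        fi
    done
    # ARMPL_DIR must be set and must point at an existing directory.
    if [ -n "${ARMPL_DIR:-}" ] && [ -d "$ARMPL_DIR" ]; then
        echo "found:   ARMPL_DIR=$ARMPL_DIR"
    else
        echo "missing: ARMPL_DIR is unset or not a directory"
        status=1
    fi
    return "$status"
}

check_toolchain || echo "Fix the items marked missing before running cmake."
```

The function returns a non-zero status if anything is missing, so it can also be used as a guard in a build script.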

Build and install GROMACS with Arm Compiler and Arm Performance Libraries

  1. Within the gromacs directory, set up your locations, for example:

    export gromacs_root=$PWD
    export gromacs_build=$gromacs_root/build
    export gromacs_install=$gromacs_root/$gromacs_ver

  2. Create the build and install directories:

    mkdir $gromacs_build
    mkdir $gromacs_install

    Keeping builds in separate directories is optional when building open source software, but is considered best practice. Having multiple build directories side-by-side is often useful for testing and debugging.

  3. Change into the build directory:

    cd $gromacs_build

  4. Run cmake with the following arguments, passing the source directory as the final argument:

    cmake \
    -DCMAKE_INSTALL_PREFIX=${gromacs_install} \
    -DCMAKE_C_COMPILER=armclang \
    -DCMAKE_CXX_COMPILER=armclang++ \
    -DGMX_DOUBLE=off \
    -DGMX_FFT_LIBRARY=fftw3 \
    -DGMX_BLAS_USER=${ARMPL_DIR}/lib/libarmpl_lp64.so \
    -DGMX_LAPACK_USER=${ARMPL_DIR}/lib/libarmpl_lp64.so \
    -DFFTWF_LIBRARY="${ARMPL_DIR}/lib/libarmpl_lp64.so" \
    -DGMX_GPU=off \
    -DGMX_MPI=on \
    -DGMX_OPENMP=on \
    -DGMX_X11=off \
    ${gromacs_root}

  5. Invoke make to build GROMACS with Arm Compiler for HPC.

    To speed up the build, run the make jobs in parallel using the -j flag. A common rule of thumb is to use 2N+1 parallel jobs, where N is the number of hardware threads (or cores) on your system. For example, on a system with a combined total of 64 hardware threads, use -j 129:

    make -j 129 VERBOSE=1

    The optional VERBOSE=1 argument shows the command line passed to the compiler or linker for each object file. Makefiles generated by cmake use VERBOSE=1 for this, not V=1.

  6. Install the build:

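    make install

  7. Build and run the tests:

    make -i check

    The -i flag tells make to ignore errors, so a failing test does not stop the remaining tests from running.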
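
The 2N+1 rule from step 5 can be computed from the machine rather than hard-coded. This is a minimal sketch under stated assumptions: the helper name make_jobs is hypothetical, and the thread count is taken from nproc (GNU coreutils, available on Ubuntu), falling back to getconf if nproc is not present.

```shell
#!/bin/sh
# Hypothetical helper: print 2N+1 for a given thread count N.
make_jobs() {
    echo $(( 2 * $1 + 1 ))
}

# N is the number of processing units the system reports as available.
threads=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
njobs=$(make_jobs "$threads")

# Print the resulting make invocation instead of running it, so the
# value can be checked first.
echo "make -j ${njobs}"
```

For the 64-thread example in step 5, make_jobs 64 prints 129. To use the value directly, run make -j "$(make_jobs "$threads")" in the build directory.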

(Optional) Optimize

Performance recommendations for Cavium ThunderX2 hardware:

  1. Configure GROMACS to force the use of tabulated Ewald non-bonded kernels by setting the GMX_NBNXN_EWALD_TABLE environment variable to 1.

  2. Use 4 threads per core.
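
As a rough illustration of how these two recommendations might be applied in a job script, the sketch below sets the environment variable and derives a thread layout. The mapping is an assumption rather than something this recipe specifies: it places one MPI rank on each physical core with 4 OpenMP threads per rank, and the cores variable (defaulting to 40) is a hypothetical placeholder for the number of physical cores you intend to use. Other layouts, such as 4 single-threaded ranks per core, also give 4 threads per core.

```shell
#!/bin/sh
# Recommendation 1: force the tabulated Ewald non-bonded kernels.
export GMX_NBNXN_EWALD_TABLE=1

# Recommendation 2: 4 hardware threads per core.
# Assumption: one MPI rank per physical core, 4 OpenMP threads per rank.
cores=${cores:-40}            # hypothetical: physical cores to use
threads_per_core=4
export OMP_NUM_THREADS=$threads_per_core

total=$(( cores * threads_per_core ))
echo "ranks=${cores} omp_threads=${OMP_NUM_THREADS} total_threads=${total}"
```

With this layout, the value of cores is what you would pass to mpirun -np when launching gmx_mpi.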

Run a simple benchmark

To download and run an example benchmark:

  1. Make a new directory called tests within the GROMACS directory, if it does not already exist, and change into it:

    mkdir -p $gromacs_root/tests
    cd $gromacs_root/tests

  2. Download a test package, for example:

    wget -N http://www.prace-ri.eu/UEABS/GROMACS/1.2/GROMACS_TestCaseA.tar.gz

  3. Extract the package:

    tar -zxvf GROMACS_TestCaseA.tar.gz

  4. Run the benchmark on 40 cores, using 40 MPI ranks with one OpenMP thread each:

    OMP_NUM_THREADS=1 mpirun -np 40 ${gromacs_install}/bin/gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -pin on -npme 0 -g logfile

    Note: The benchmark typically takes tens of minutes, but this varies from system to system. The -maxh 0.50 option limits the run to about half an hour.
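
Once the run has finished, the headline result can be pulled out of the log. The sketch below is illustrative only: the helper name ns_per_day is hypothetical, and it assumes that -g logfile produces a file named logfile.log in the current directory and that the log contains a summary line beginning with "Performance:" followed by the ns/day and hour/ns figures.

```shell
#!/bin/sh
# Hypothetical helper: print the ns/day figure from an mdrun log.
# Assumes the log contains a line of the form
#   Performance:   <ns/day>   <hour/ns>
ns_per_day() {
    awk '$1 == "Performance:" { print $2 }' "$1"
}

# Assumed log name, derived from the "-g logfile" option used above.
log=logfile.log
if [ -f "$log" ]; then
    echo "ns/day: $(ns_per_day "$log")"
else
    echo "No log found at $log"
fi
```

Comparing this figure between runs, for example with and without the optional optimizations, gives a quick indication of whether a change helped.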