
D Known issues

The most significant known issues for the latest release are summarized here:


D.1 MAP

The following known issues affect MAP:

  • I/O metrics are not available on some systems, including Cray systems.
  • CPU instruction metrics are only available on x86_64 systems.
  • Thread activity is not sampled while a process is inside an MPI call whose duration spans multiple samples. This can appear as 'uncategorized' (white) time in the Application activity bar in the Pthread View; this uncategorized time coincides with long-running MPI calls.
  • MAP does not support code that spawns new processes, such as fork, exec, and MPI_Comm_spawn. In these cases, MAP profiles only the original process.

D.2 XALT Wrapper

The XALT wrapper is known to cause several issues when used in conjunction with Arm Forge, such as:

  • MPI programs cannot be debugged due to a hang during start up.
  • Error messages are reported relating to the permissions on qstat.

In each case, the workaround is to disable the XALT wrapper by unloading the XALT module.
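On systems that manage XALT through environment modules, unloading it typically looks like the following. The module name "xalt" is an assumption; it varies between sites, so check the output of module list first:

```shell
# Confirm whether an XALT module is currently loaded.
module list

# Unload it to disable the XALT wrapper before starting Arm Forge.
# "xalt" is the usual module name, but your site may use a different one.
module unload xalt
```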


D.3 MPICH

MPICH 3.0.3 and 3.0.4 do not work with Arm Forge due to an MPICH defect. MPICH 3.1 is fully supported.

D.4 Open MPI

Message queue debugging does not work in Open MPI 1.8.1 to 1.8.5. This issue is fixed in Open MPI 1.8.6.

The following versions of Open MPI do not work with Arm Forge because of bugs in the Open MPI debug interface:

  • Open MPI 2.1.0 to 2.1.2.
  • Open MPI 3.0.0 when compiled with the Arm Compiler for HPC on Armv8 (AArch64) systems.
  • Open MPI 3.0.x when compiled with some versions of the GNU compiler on Armv8 (AArch64) systems.
  • Open MPI 3.x when compiled with some versions of the IBM XLC/XLF or PGI compilers on IBM Power (PPC64le little-endian, POWER8, or POWER9) systems.
  • Open MPI 3.1.0 and 3.1.1.
  • Open MPI 3.x with any version of PMIx earlier than 2.

To resolve any of the above issues, instead select Open MPI (Compatibility) for the MPI Implementation.

D.4.1 Open MPI 3.x on IBM Power with the GNU compiler

To use Open MPI versions 3.0.0 to 3.0.4 (inclusive) or 3.1.0 to 3.1.3 (inclusive) with the GNU compiler on IBM Power systems, you might need to configure the Open MPI build with CFLAGS=-fasynchronous-unwind-tables. This fixes a startup bug, caused by missing debug information and optimization in the Open MPI library, where Arm Forge is unable to step out of MPI_Init into your main function. If you already configure with -g, you do not need to add this extra flag. An example configure command is:

    ./configure --prefix=/software/openmpi-3.1.2 CFLAGS=-fasynchronous-unwind-tables

If you do not have the option to recompile your MPI, an alternative workaround is to select Open MPI (Compatibility) for the MPI Implementation. This issue is fixed in later versions.


D.5 CUDA

The following known issues affect CUDA:

  • To debug or profile a CUDA program, compile the program with a version of the CUDA toolkit that matches the version of the installed CUDA driver. For example, if the CUDA 8.0 driver is installed, then you must use the CUDA 8.0 toolkit to compile your program.


    Compiling with mismatched CUDA toolkit and CUDA driver versions will cause errors when debugging or profiling.


    To force DDT to use a particular version of the CUDA debugger, set the ALLINEA_FORCE_CUDA_VERSION environment variable to a version number. For example, ALLINEA_FORCE_CUDA_VERSION=8.0 for CUDA 8.0. This may cause issues due to CUDA version incompatibilities.

  • GPU profiling is only supported when using a CUDA 8.0 toolkit with a CUDA 8.0 driver.
  • The Cray OpenACC compiler in CCE 8.1.2 and earlier fails to generate debug information for local variables in accelerated regions. Install CCE 8.1.3 or later.
  • When debugging a CUDA application, adding watchpoints on either host or kernel code is not supported.
  • When debugging a CUDA application, using the Step threads together box and Run to here to step into OpenMP regions is not supported. Breakpoints can be used to stop at the desired line.
  • Stepping multiple warps simultaneously (e.g. those in the same block or kernel) is not supported in CUDA 9.0 and above. Individual warps can be stepped sequentially to achieve the same effect.
  • When CUDA is set to Detect invalid accesses (memcheck), placing breakpoints in CUDA kernels is only supported in CUDA 10.1 or later.
  • A driver issue in CUDA 9.1 prevents DDT from debugging CUDA GPU applications on Cray machines that use Cray MPT (aprun). As a workaround, launch the CUDA application outside of DDT and attach to it.
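As a quick sanity check for the toolkit/driver match described above, you can compare the versions reported by the toolkit and the driver, and, if necessary, force the CUDA debugger version. The version value 8.0 below is illustrative only; substitute the version that matches your installation:

```shell
# Report the CUDA toolkit version that compiles your program.
nvcc --version

# Report the installed CUDA driver version.
nvidia-smi --query-gpu=driver_version --format=csv,noheader

# If the versions differ, force DDT to use a specific CUDA debugger
# version (illustrative value; mismatches may still cause issues).
export ALLINEA_FORCE_CUDA_VERSION=8.0
```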


D.6 SLURM

On Cray X-series systems, only native SLURM is supported; hybrid mode is not supported.

D.7 PGI compilers

Version 14.9 or later of the PGI compilers is required to compile the Arm MAP MPI wrappers as a static library.

D.8 64-bit Arm/Power platforms

For best operation, DDT and MAP require debug symbols for the runtime libraries to be installed in addition to debug symbols for the program itself.
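How you install debug symbols for the runtime libraries depends on your distribution. As an illustrative sketch for the C library (package and tool names are assumptions; consult your distribution's documentation):

```shell
# RHEL/CentOS: install debug symbols for the C library.
debuginfo-install glibc

# Debian/Ubuntu equivalent:
apt-get install libc6-dbg
```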

D.9 F1 user guide

Sometimes, pressing "F1" does not display the user guide correctly, because stale cached files can corrupt the document browser. If pressing "F1" shows blank or invisible documents, remove the cached files:

    rm -r ~/.local/share/data/Arm

D.10 See also

See also additional known issues here: