1 Introduction to Arm Forge

Arm Forge combines Arm DDT, the leading parallel debugger for time-saving high-performance application debugging, and Arm MAP, the trusted performance profiler for invaluable optimization advice.

Arm Forge supports many parallel architectures and models, including MPI, UPC, CUDA and OpenMP. It is a cross-platform tool, with support for the latest compilers, the C++11 standard, and Intel, 64-bit Arm, AMD, OpenPOWER and NVIDIA GPU hardware.

Arm Forge provides you with everything you need to debug, fix and profile programs at any scale. One common interface makes it easy to move between Arm DDT and Arm MAP during code development.

Arm Forge provides native remote clients for Windows, Mac OS X and Linux. Use a remote client to connect to your cluster, where you can run, debug, profile, edit and compile your application files.

1.1 Arm DDT

Arm DDT is a powerful graphical debugger suitable for many different development environments, including:

  • Single process and multithreaded software.
  • OpenMP.
  • Parallel (MPI) software.
  • Heterogeneous software, for example, GPU software.
  • Hybrid codes mixing paradigms, for example, MPI with OpenMP, or MPI with CUDA.
  • Multi-process software including client-server applications.

Arm DDT helps you to find and fix problems on a single thread or across hundreds of thousands of threads. It includes static analysis to highlight potential code problems, integrated memory debugging to identify reads and writes that are outside of array bounds, and integration with MPI message queues.

Arm DDT supports:

  • C, C++, and all derivatives of Fortran, including Fortran 90.
  • Python (CPython 2.7), with limited support.
  • Parallel languages/models including MPI, UPC, and Fortran 2008 Co-arrays.
  • GPU languages such as HMPP, OpenMP Accelerators, CUDA and CUDA Fortran.

1.1.1 Related information

  • Chapter 5 provides details about getting started with Arm DDT.

1.2 Arm MAP

Arm MAP is a parallel profiler that shows you which lines of code took the most time to run, and why. Arm MAP does not require any complicated configuration, and you do not need to have experience with profiling tools to use it.

Arm MAP supports:

  • MPI, OpenMP and single-threaded programs.
  • Small data files. All data is aggregated on the cluster and only a few megabytes are written to disk, regardless of the size or duration of the run.
  • Sophisticated source code view, enabling you to analyze performance across individual functions.
  • Both interactive and batch modes for gathering profile data.
  • A rich set of metrics that show memory usage, floating-point calculations and MPI usage across processes, including:
    • Percentage of vectorized instructions, including AVX extensions, used in each part of the code.
    • Time spent in memory operations, and how it varies over time and across processes, to help you identify cache bottlenecks.
    • A visual overview across aggregated processes and cores that highlights any regions of imbalance in the code.
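
To make the idea of imbalance across processes concrete, the following is a minimal sketch of one common way to quantify it: the ratio of the slowest process's time to the mean time for a region of code. This is an illustration only, not the metric Arm MAP computes. The function name imbalance_ratio and the sample timings are hypothetical.

```python
from statistics import mean


def imbalance_ratio(times_per_process):
    """Return max/mean of per-process times (hypothetical helper).

    A value of 1.0 means every process spent the same time in the region;
    larger values mean some processes wait while the slowest one finishes.
    """
    if not times_per_process:
        raise ValueError("need at least one per-process time")
    average = mean(times_per_process)
    if average == 0:
        # No time was spent in the region, so treat it as balanced.
        return 1.0
    return max(times_per_process) / average


# Assumed example data: seconds spent in one code region by four MPI ranks.
region_times = [2.0, 2.0, 2.0, 6.0]
print(imbalance_ratio(region_times))
```

In this sketch, a ratio well above 1.0 for a region suggests that the work is unevenly distributed and that the region is worth inspecting in the profiler.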

1.2.1 Related information

  • Chapter 16 provides details about getting started with Arm MAP.

1.3 Online resources

You can find tutorials, webinars and white papers on the Arm developer website.

If you have questions or require further support, please get in touch with our dedicated support team.

You can download Arm Forge from the Arm Forge downloads page.