
5 Getting started

When compiling the program that you wish to debug, you must add the debug flag to your compile command. For most compilers this is -g.

It is also advisable to turn off compiler optimizations, as these can make debugging appear strange and unpredictable. If your program is already compiled without debug information, you will need to recompile the files that you are interested in.
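As a minimal sketch of this advice, the compile line can be assembled as follows. It assumes a GCC- or Clang-style MPI compiler wrapper named mpicc, where -O0 disables optimization; the file names hello.c and hello are hypothetical. The command is printed rather than run so that it can be checked first:

```shell
#!/bin/sh
# Hypothetical source and output names; substitute your own.
SRC=hello.c
OUT=hello

# -g adds debug information. -O0 is the GCC/Clang spelling for
# "no optimization"; other compilers may use a different flag.
DEBUG_FLAGS="-g -O0"

# Assumes an MPI compiler wrapper called mpicc; use your usual compiler.
COMPILE_CMD="mpicc $DEBUG_FLAGS -o $OUT $SRC"

# Dry run: print the command so it can be pasted into your build.
echo "$COMPILE_CMD"
```

If your build system already sets optimization flags, check that they do not override the flags shown here.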

The Welcome Page allows you to choose what kind of debugging you want to do. For example, you can:

  • Run a program from DDT and debug it.
  • Debug a program you launch manually (for example, on the command line).
  • Attach to an already running program.
  • Open core files generated by a program that crashed.
  • Connect to a remote system and accept a Reverse Connect request.

5.1 Running a program


Figure 12: Run Window

If you click the Run button on the Welcome Page you see the window above. The settings are grouped into sections. Click the Details… button to expand a section. The settings in each section are described below.

5.1.1 Application

Application: The full path name to your application. If you specified one on the command line, this is filled in. You may browse for an application by clicking the Browse button.

Note: Many MPIs have problems working with directory and program names containing spaces. You are advised to avoid the use of spaces in directory and file names.

Arguments: (optional) The arguments passed to your application. These are automatically filled if you entered some on the command line.

Note: Avoid using quote characters such as ' and ", as these may be interpreted differently by DDT and your command shell. If you must use these and cannot get them to work as expected, please contact Arm support.

stdin file: (optional) This allows you to choose a file to be used as the standard input (stdin) for your program. DDT automatically adds arguments to mpirun to ensure your input file is used.

Working Directory: (optional) The working directory to use when debugging your application. If this is blank then DDT's working directory is used instead.

5.1.2 MPI

Note: If you only have a single-process license, or have selected none as your MPI Implementation, DDT is in single-process mode and the MPI options are not shown. See section 5.4 Debugging single-process programs for more details about using DDT with a single process.

Number of processes: The number of processes that you wish to debug. DDT supports hundreds of thousands of processes but this is limited by your license.

Number of nodes: This is the number of compute nodes that you wish to use to run your program.

Processes per node: This is the number of MPI processes to run on each compute node.

Implementation: The MPI implementation to use. If you are submitting a job to a queue the queue settings will also be summarized here. You may change the MPI implementation by clicking on the Change… button.


  • The choice of MPI implementation is critical to correctly starting DDT. Your system will normally use one particular MPI implementation. If you are unsure as to which to pick, try generic, consult your system administrator or Arm support. A list of settings for common implementations is provided in Appendix E MPI distribution notes and known issues.
  • If your desired MPI command is not in your PATH, or you wish to use an MPI run command that is not your default one, you can configure this using the Options window (See section A.5.1 System).

mpirun arguments: (optional) The arguments that are passed to mpirun or your equivalent, usually prior to your executable name in normal mpirun usage. You can place machine file arguments, if necessary, here. For most users this box can be left empty. You can also specify mpirun arguments on the command line (using the --mpiargs command line argument) or using the ALLINEA_MPIRUN_ARGUMENTS environment variable if this is more convenient.


  • You should not enter the -np argument as DDT will do this for you.
  • You should not enter the --task-nb or --process-nb arguments as DDT will do this for you.
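The two rules above can be checked before the arguments reach DDT. The following is a sketch only: check_mpi_args is a hypothetical helper, not part of DDT, and the -machinefile argument and ./hosts file are assumed examples of extra mpirun arguments. The ddt command is printed rather than executed:

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the argument string can be passed on,
# 1 if it contains a process-count option that DDT supplies itself.
check_mpi_args() {
    for word in $1; do
        case "$word" in
            -np|--task-nb|--process-nb|--task-nb=*|--process-nb=*)
                echo "remove '$word': DDT adds the process count itself" >&2
                return 1 ;;
        esac
    done
    return 0
}

MPI_ARGS="-machinefile ./hosts"   # assumed example of a machine file argument

if check_mpi_args "$MPI_ARGS"; then
    # Either export the variable before starting DDT...
    export ALLINEA_MPIRUN_ARGUMENTS="$MPI_ARGS"
    # ...or pass the same string with --mpiargs (dry run, printed only).
    echo "ddt --np=4 --mpiargs=\"$MPI_ARGS\" ./wave_c 20"
fi
```

Use only one of the two mechanisms in practice; both are shown here for comparison.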

5.1.3 OpenMP

Number of OpenMP threads: The number of OpenMP threads to run your application with. The OMP_NUM_THREADS environment variable is set to this value.

5.1.4 CUDA

If your license supports it, you may also debug GPU programs by enabling CUDA support. For more information on debugging CUDA programs, please see section 14 CUDA GPU debugging.

Track GPU Allocations: Tracks CUDA memory allocations made using cudaMalloc, and similar methods. See 12.2 CUDA memory debugging for more information.

Detect invalid accesses (memcheck): Turns on the CUDA-MEMCHECK error detection tool. See 12.2 CUDA memory debugging for more information.

5.1.5 Memory debugging

Clicking the Details… button will open the Memory Debugging Settings window.

See section 12.4 Configuration for full details of the available Memory Debugging settings.

5.1.6 Environment variables

The optional Environment Variables section should contain additional environment variables that should be passed to mpirun or its equivalent. These environment variables may also be passed to your program, depending on which MPI implementation your system uses. Most users will not need to use this box.

Note: On some systems it may be necessary to set environment variables for the DDT backend itself. For example: if /tmp is unusable on the compute nodes you may wish to set TMPDIR to a different directory. You can specify such environment variables in /path/to/ddt/lib/environment. Enter one variable per line and separate the variable name and value with =, for example, TMPDIR=/work/user.
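The file format described in the note can be sketched as follows. A temporary directory stands in for /path/to/ddt so that the sketch can run anywhere; the sanity check at the end is an assumption about what a well-formed line looks like, not a rule taken from DDT:

```shell
#!/bin/sh
# Stand-in for /path/to/ddt; point this at your real installation instead.
DDT_DIR=$(mktemp -d)
mkdir -p "$DDT_DIR/lib"
ENV_FILE="$DDT_DIR/lib/environment"

# One NAME=value pair per line.
printf '%s\n' 'TMPDIR=/work/user' >> "$ENV_FILE"

# Assumed sanity check: report any non-empty line that is not NAME=value.
if grep -v -E '^[A-Za-z_][A-Za-z0-9_]*=' "$ENV_FILE" | grep -q .; then
    echo "malformed line in $ENV_FILE" >&2
fi

cat "$ENV_FILE"
```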

5.1.7 Plugins

The optional Plugins section allows you to enable plugins for various third-party libraries, such as the Intel Message Checker or Marmot. See section 13 Using and writing plugins for more information.

Click Run to start your program, or Submit if working through a queue. See section A.2 Integration with queuing systems. This runs your program through the debug interface you selected and allows your MPI implementation to determine which nodes to start which processes on.

Note: If you have a program compiled with Intel ifort or GNU g77 you may not see your code and highlight line when DDT starts. This is because those compilers create a pseudo MAIN function, above the top level of your code. To fix this you can either open your Source Code window and add a breakpoint in your code, then run to that breakpoint, or you can use the Step into function to step into your code.

When your program starts, DDT attempts to determine the MPI world rank of each process. If this fails, the following error message is displayed:


Figure 13: MPI rank error

This means that the number DDT shows for each process may not be the MPI rank of the process. To correct this you can tell DDT to use a variable from your program as the rank for each process.

See section 8.17 Assigning MPI ranks for details.

To end your current debugging session select the End Session menu option from the File menu. This closes all processes and stops any running code. If any processes remain you may have to clean them up manually using the kill command, or a command provided with your MPI implementation.

5.2 Express Launch

Each of the Arm Forge products can be launched by typing its name in front of an existing mpiexec command:

   $ ddt mpiexec -n 128 examples/hello memcrash

This startup method is called Express Launch and is the simplest way to get started. If your MPI is not yet supported in this mode, you will see an error message like this:

   'MPICH 1 standard' programs cannot be started using Express Launch syntax (launching with an mpirun command).

   Try this instead:
   ddt --np=256 ./wave_c 20

   Type ddt --help for more information.

This is referred to as Compatibility Mode, in which the mpiexec command is not included and the arguments to mpiexec are passed via a --mpiargs="args here" parameter.
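To compare the two launch styles side by side, the sketch below builds both command lines from the same values. The program and process count come from the Express Launch example above; the -machinefile argument is an assumed example of an extra mpiexec argument. Both commands are printed, not executed:

```shell
#!/bin/sh
# Values taken from the Express Launch example above.
NPROCS=128
PROGRAM=examples/hello
PROGRAM_ARGS=memcrash
# Assumed example of an extra mpiexec argument.
MPI_ARGS="-machinefile ./hosts"

# Express Launch: ddt goes in front of the unchanged mpiexec command.
EXPRESS="ddt mpiexec -n $NPROCS $MPI_ARGS $PROGRAM $PROGRAM_ARGS"

# Compatibility Mode: no mpiexec; DDT takes the process count itself and
# receives the remaining mpiexec arguments through --mpiargs.
COMPAT="ddt --np=$NPROCS --mpiargs=\"$MPI_ARGS\" $PROGRAM $PROGRAM_ARGS"

echo "Express Launch:     $EXPRESS"
echo "Compatibility Mode: $COMPAT"
```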

One advantage of Express Launch mode is that it is easy to modify existing queue submission scripts to run your program under one of the Arm Forge products. This works best for Arm DDT with Reverse Connect (ddt --connect) for interactive debugging, or in offline mode (ddt --offline). See 3.3 Reverse Connect for more details.

If you cannot use Reverse Connect and wish to use interactive debugging from a queue, you may need to configure DDT to generate job submission scripts for you. More details on this can be found in 5.10 Starting a job in a queue and A.2 Integration with queuing systems.

The following lists the MPI implementations currently supported by Express Launch:

  • bullx MPI
  • Cray X-Series (MPI/SHMEM/CAF)
  • Intel MPI
  • MPICH 2
  • MPICH 3
  • Open MPI (MPI/SHMEM)
  • Oracle MPT
  • Open MPI (Cray XT/XE/XK)
  • Spectrum MPI
  • Spectrum MPI (PMIx)
  • Cray XT/XE/XK (UPC)

5.2.1 Run dialog box

In Express Launch mode, the Run dialog has a restricted number of options:


Figure 14: Express Launch DDT Run dialog box

5.3 remote-exec required by some MPIs

When using SGI MPT, MPICH 1 Standard or the MPMD variants of MPICH 2, MPICH 3 or Intel MPI, DDT will allow mpirun to start all the processes, then attach to them while they're inside MPI_Init.

This method is often faster than the generic method, but requires the remote-exec facility in DDT to be correctly configured if processes are being launched on a remote machine. For more information on remote-exec, please see section A.4 Connecting to remote programs (remote-exec).

Note: If DDT is running in the background (for example, ddt &) then this process may get stuck (some SSH versions cause this behavior when asking for a password). If this happens to you, go to the terminal and use the fg or similar command to make DDT a foreground process, or run DDT again, without using "&".

If DDT cannot find a password-free way to access the cluster nodes then you will not be able to use the specialized startup options. Instead, you can use generic, although startup may be slower for large numbers of processes.

In addition to the MPI implementations listed above, DDT requires password-free access to the compute nodes for all MPI implementations except Cray MPT when explicitly starting by attaching.

5.4 Debugging single-process programs


Figure 15: Single-Process Run dialog

Users with single-process licenses will immediately see the Run dialog that is appropriate for single-process applications.

Users with multi-process licenses can uncheck the MPI check box to run a single process program.

Select the application, either by typing the file name in, or by clicking the Browse button and selecting it in the file browser. Arguments can be typed into the supplied box.

Click Run to start your program.

Note: If you have a program compiled with Intel ifort or GNU g77 you may not see your code and highlight line when DDT starts. This is because those compilers create a pseudo MAIN function, above the top level of your code. To fix this you can either open your Source Code window and add a breakpoint in your code and then play to that breakpoint, or you can use the Step Into function to step into your code.

To end your current debugging session select the End Session menu option from the File menu. This will close all processes and stop any running code.

5.5 Debugging OpenMP programs

When running an OpenMP program, set the Number of OpenMP threads value to the number of threads you require. DDT will run your program with the OMP_NUM_THREADS environment variable set to the appropriate value.

There are several important points to keep in mind while debugging OpenMP programs:

  1. Parallel regions created with #pragma omp parallel (C) or !$OMP PARALLEL (Fortran) will usually not be nested in the Parallel Stack View under the function that contained the #pragma. Instead they will appear under a different top-level item. The top-level item is often in the OpenMP runtime code, and the parallel region appears several levels down in the tree.
  2. Some OpenMP libraries only create the threads when the first parallel region is reached. It is possible you may only see one thread at the start of the program.
  3. You cannot step into a parallel region. Instead, check the Step threads together box and use the Run to here command to synchronize the threads at a point inside the region. These controls are discussed in more detail in their own sections of this document.
  4. You cannot step out of a parallel region. Instead, use Run to here to leave it. Most OpenMP libraries work best if you keep the Step threads together box ticked until you have left the parallel region. With the Intel OpenMP library, this means you will see the Stepping Threads window and will have to click Skip All once.
  5. Leave Step threads together off when you are outside a parallel region, as OpenMP worker threads usually do not follow the same program flow as the main thread.
  6. To control threads individually, use the Focus on Thread control. This allows you to step and play one thread without affecting the rest. This is helpful when you want to work through a locking situation or to bring a stray thread back to a common point. The Focus controls are discussed in more detail in their own section of this document.
  7. Shared OpenMP variables may appear twice in the Locals window. This is one of the many unfortunate side-effects of the complex way OpenMP libraries interfere with your code to produce parallelism. One copy of the variable may have a nonsense value, which is usually easy to recognize. The correct values are shown in the Evaluate and Current Line windows.
  8. Parallel regions may be displayed as a new function in the stack views. Many OpenMP libraries implement parallel regions as automatically-generated "outline" functions, and DDT shows you this. To view the value of variables that are not used in the parallel region, you may need to switch to thread 0 and change the stack frame to the function you wrote, rather than the outline function.
  9. Stepping often behaves unexpectedly inside parallel regions. Reduction variables usually require some sort of locking between threads, and may even appear to make the current line jump back to the start of the parallel region. If this happens, step over several times and the current line returns to the correct location.
  10. Some compilers optimize parallel loops regardless of the options you specified on the command line. This has many strange effects, including code that appears to move backwards as well as forwards, and variables that are not displayed or have nonsense values because they have been optimized out by the compiler.
  11. The thread IDs displayed in the Process Group Viewer and Cross-Thread Comparison window will match the value returned by omp_get_thread_num() for each thread, but only if your OpenMP implementation exposes this data to DDT. GCC's support for OpenMP (GOMP) needs to be built with TLS enabled for our thread IDs to match the value returned by omp_get_thread_num(), whereas your system GCC most likely has this option disabled. The same thread IDs will be displayed as tooltips for the threads in the thread viewer, but only if your OpenMP implementation exposes this data.

If you are using DDT with OpenMP and would like to tell us about your experiences, please contact Arm support with the subject title OpenMP feedback.

5.6 Manual launching of multi-process non-MPI programs

DDT can only launch MPI programs and scalar (single process) programs itself. The Manual Launch (Advanced) button on the Welcome Page allows you to debug multi-process and multi-executable programs. These programs do not necessarily need to be MPI programs. You can debug programs that use other parallel frameworks, or both the client and the server from a client/server application in the same DDT session.

You must run each program you want to debug manually using the ddt-client command, similar to debugging with a scalar debugger like the GNU debugger (gdb). However, unlike a scalar debugger, you can debug more than one process at the same time in the same DDT session, as long as your license permits it. Each program you run will show up as a new process in the DDT window.

For example, to debug both client and server in the same DDT session:

  1. Click the Manual Launch (Advanced) button.
  2. Select 2 processes.


    Figure 16: Manual Launch Window
  3. Click the Listen button.
  4. At the command line run:
       ddt-client server &
       ddt-client client &

The server process appears as process 0 and the client as process 1 in the DDT window.


Figure 17: Manual Launch Process Groups

After you have run the initial programs you may add extra processes to the DDT session, for example extra clients could be added, using ddt-client in the same way.

   ddt-client client2 &

If you check Start debugging after the first process connects you do not need to specify how many processes you want to launch in advance. You can start debugging after the first process connects and add extra processes later as above.

5.7 Debugging MPMD programs

The easiest way to debug MPMD programs is by using Express Launch to start your application.

To use Express Launch, simply prefix your normal MPMD launch line with ddt, for example:

   ddt mpirun -n 1 ./master : -n 2 ./worker

For more information on Express Launch, and compatible MPI implementations, see section 5.2 Express Launch.

5.7.1 Debugging MPMD programs without Express Launch

If you are using Open MPI, MPICH 2, MPICH 3 or Intel MPI, DDT can be used to debug multiple program, multiple data (MPMD) programs. To start an MPMD program in DDT:

  1. MPICH 2 and Intel MPI only: Select the MPMD variant of the MPI Implementation on the System page of the Options window, for example, for MPICH 2 select MPICH 2 (MPMD).
  2. Click the Run button on the Welcome Page.
  3. Select one of the MPMD programs in the Application box. It does not matter which executable you choose.
  4. Enter the total number of processes for the MPMD job in the Number of processes box.
  5. Enter an MPMD style command line in the mpirun Arguments box in the MPI section of the Run window, for example:
       -np 4 hello : -np 4 program2

     or, using an application file:

       --app /path/to/my_app_file
  6. Click the Run button.

Note: Ensure that the sum of processes in step 5 is equal to the number of processes set in step 4.
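The consistency check in the note can be done by hand, or with a small helper such as the sketch below. mpmd_total is a hypothetical function, not a DDT command; it assumes that process counts are given with -np or -n followed by a number, as in the examples in this section:

```shell
#!/bin/sh
# Hypothetical helper: sum every value that follows -np or -n in an
# MPMD-style argument string.
mpmd_total() {
    total=0
    expect_count=no
    for word in $1; do
        if [ "$expect_count" = yes ]; then
            total=$((total + word))
            expect_count=no
        elif [ "$word" = "-np" ] || [ "$word" = "-n" ]; then
            expect_count=yes
        fi
    done
    echo "$total"
}

MPI_ARGS="-np 4 hello : -np 4 program2"   # text for the mpirun Arguments box
NUM_PROCESSES=8                           # value for the Number of processes box

if [ "$(mpmd_total "$MPI_ARGS")" -eq "$NUM_PROCESSES" ]; then
    echo "process counts match"
else
    echo "mismatch: arguments request $(mpmd_total "$MPI_ARGS"), box says $NUM_PROCESSES" >&2
fi
```

The helper does not read application files, so when you use --app you need to add up the counts in that file yourself.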

5.7.2 Debugging MPMD programs in Compatibility mode

If you are using Open MPI in Compatibility mode, for example, because you do not have SSH access to the compute nodes, then replace:

   -np 2 ./progc.exe : -np 4 ./progf90.exe

in the mpirun Arguments / appfile with this:

   -np 2 /path/to/ddt/bin/ddt-client ./progc.exe : -np 4 /path/to/ddt/bin/ddt-client ./progf90.exe

5.8 Opening core files


Figure 18: The Open Core Files Window

DDT allows you to open one or more core files generated by your application.

To debug using core files, click the Open Core Files button on the Welcome Page. This opens the Open Core Files window, which allows you to select an executable and a set of core files. Click OK to open the core files and start debugging them.

While DDT is in this mode, you cannot play, pause or step, because there is no process active. You are, however, able to evaluate expressions and browse the variables and stack frames saved in the core files.

The End Session menu option will return DDT to its normal mode of operation.

5.9 Attaching to running programs

DDT can attach to running processes on any machine you have access to, whether they are from MPI or scalar jobs, even if they have different executables and source pathnames. Clicking the Attach to a Running Program button on the Welcome Page shows DDT's Attach Window:


Figure 19: Attach Window

There are two ways to select the processes you want to attach to: you can either choose from a list of automatically detected MPI jobs (for supported MPI implementations) or manually select from a list of processes.

5.9.1 Automatically detected MPI jobs

DDT can automatically detect MPI jobs started on the local host for selected MPI implementations. This also applies to other hosts you have access to, if an Attach Hosts File is configured. See section A.5.1 System for more details.

The list of detected MPI jobs is shown on the Automatically-detected MPI jobs tab of the Attach Window. Click the header for a particular job to see more information about that job. Once you have found the job you want to attach to simply click the Attach button to attach to it.

Note: Non-MPI programs that were started using MPI may not appear in this window, for example, mpirun -np 2 sleep 1000.

5.9.2 Attaching to a subset of an MPI job

You may want to attach only to a subset of ranks from your MPI job. You can choose this subset using the Attach to ranks box on the Automatically-detected MPI jobs tab of the Attach Window. You may change the subset later by selecting the File → Change Attached Processes… menu item. The menu item is only available for jobs that were attached to, and not for jobs that were launched using DDT.

5.9.3 Manual process selection

You can manually select which processes to attach to from a list of processes using the List of all processes tab of the Attach Window. If you want to attach to a process on a remote host see section A.4 Connecting to remote programs (remote-exec) first.

Initially the list of processes is blank while DDT scans the nodes, provided in your node list file, for running processes. When all the nodes have been scanned (or have timed out) the window appears as shown above. Use the Filter box to find the processes you want to attach to. On non-Linux platforms you also need to select the application executable you want to attach to. Ensure that the list shows all the processes you wish to debug in your job, and no extra/unnecessary processes. You may modify the list by selecting and removing unwanted processes, or alternatively selecting the processes you wish to attach to and clicking on Attach to Selected Processes. If no processes are selected, DDT uses the whole visible list.

On Linux you may use DDT to attach to multiple processes running different executables. When you select processes with different executables the application box changes to read Multiple applications selected. DDT creates a process group for each distinct executable.

With some supported MPI implementations (for example, Open MPI) DDT shows MPI processes as children of the mpirun (or equivalent) command, as shown in the following figure. Clicking the mpirun command automatically selects all the MPI child processes.


Figure 20: Attaching with Open MPI

Some MPI implementations (such as MPICH 1) create forked (child) processes that are used for communication, but are not part of your job. To avoid displaying and attaching to these, make sure the Hide Forked Children box is ticked. DDT's definition of a forked child is a child process that shares the parent's name. Some MPI implementations create your processes as children of each other. If you cannot see all the processes in your job, try clearing this checkbox and selecting specific processes from the list.

Once you click on the Attach to Selected/Listed Processes button, DDT uses remote-exec to attach a debugger to each process you selected and proceeds to debug your application as if you had started it with DDT. When you end the debug session, DDT detaches from the processes rather than terminating them. This allows you to attach again later if you wish.

DDT examines the processes it attaches to and tries to discover the MPI_COMM_WORLD rank of each process. If you have attached to two MPI programs, or a non-MPI program, then you may see the following message:


Figure 21: MPI rank error

If there is no rank, for example, if you have attached to a non-MPI program, then you can ignore this message and use DDT as normal. If there is, then you can tell DDT what the correct rank for each process is via the Use as MPI Rank button in the Cross-Process Comparison Window. See section 8.17 Assigning MPI ranks for details.

Note that the stdin, stderr and stdout (standard input, error and output) are not captured by DDT if used in attaching mode. Any input/output continues to work as it did before DDT attached to the program, for example, from the terminal or perhaps from a file.

5.9.4 Configuring attaching to remote hosts

To attach to remote hosts in DDT, click the Choose Hosts button in the attach dialog. This displays the list of hosts to be used for attaching.


Figure 22: Choose Hosts Window

From here you can add and remove hosts, as well as unchecking hosts that you wish to temporarily exclude.

To import a list of hosts from a file, click the Import button.

The hosts list is populated from the Attach Hosts File. To configure the hosts, use the Options window: File → Options (Arm Forge → Preferences on Mac OS X).

Each remote host is scanned for processes, and the result is displayed in the attach window. If you have trouble connecting to remote hosts, please see section A.4 Connecting to remote programs (remote-exec).

5.9.5 Using DDT command-line arguments

As an alternative to starting DDT and using the Welcome Page, DDT can instead be instructed to attach to running processes from the command-line.

To do so, you need to specify a list of hostnames and process identifiers (PIDs). If a hostname is omitted then localhost is assumed.

The list of hostnames and PIDs can be given on the command-line using the --attach option:

   mark@holly:~$ ddt --attach=11057,node5:11352

Another command-line possibility is to specify the list of hostnames and PIDs in a file and use the --attach-file option:

   mark@holly:~$ cat /home/mark/ddt/examples/hello.list


   mark@holly:~$ ddt --attach-file=/home/mark/ddt/examples/hello.list

In both cases, if just a number is specified for a hostname:PID pair, then localhost: is assumed.
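The defaulting rule can be illustrated with a short sketch. expand_attach_list is a hypothetical helper written for this illustration; it mimics how the comma-separated --attach list is read according to the rule above and is not DDT's own parser:

```shell
#!/bin/sh
# Hypothetical helper: expand a comma-separated --attach list into one
# "host:pid" line per entry, assuming localhost when no host is given.
expand_attach_list() {
    old_ifs=$IFS
    IFS=,
    for entry in $1; do
        case "$entry" in
            *:*) echo "$entry" ;;             # host given explicitly
            *)   echo "localhost:$entry" ;;   # bare PID
        esac
    done
    IFS=$old_ifs
}

# The list from the --attach example above.
expand_attach_list "11057,node5:11352"
```

For the example list, the bare PID 11057 is treated as a process on localhost and 11352 as a process on node5.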

These command-line options work for both single- and multi-process attaching.

5.10 Starting a job in a queue

In most cases you can debug a job simply by putting ddt --connect in front of the existing mpiexec or equivalent command in your job script. If a GUI is running on the login node or it is connected to it via the remote client, then a message is displayed prompting you with the option to debug the job when it starts.

See 5.2 Express Launch and 3.3 Reverse Connect for more details.
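As a sketch of this change, the following rewrites the launch line of an existing job script with sed. The job script content is hypothetical, and the sketch assumes that the launch line starts with mpiexec at the beginning of a line; scheduler directives are omitted because they depend on your queue:

```shell
#!/bin/sh
# Hypothetical job script, written to a temporary file for this sketch.
JOB_SCRIPT=$(mktemp)
cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/sh
# (scheduler directives for your queue would go here)
mpiexec -n 128 examples/hello memcrash
EOF

# Put "ddt --connect" in front of the existing mpiexec command.
DEBUG_SCRIPT=$(sed 's/^mpiexec /ddt --connect mpiexec /' "$JOB_SCRIPT")

# Show the rewritten script; save it to a file and submit it as usual.
printf '%s\n' "$DEBUG_SCRIPT"
rm -f "$JOB_SCRIPT"
```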

If DDT has been configured to be integrated with a queue/batch environment, as described in section A.2 Integration with queuing systems, then you may use DDT to submit your job directly from the GUI. In this case, a Submit button is presented on the Run Window, instead of the ordinary Run button. Clicking Submit from the Run Window will display the queue status until your job starts. DDT will execute the display command every second and show you the standard output. If your queue display is graphical or interactive then you cannot use it here.

If your job does not start or you decide not to run it, click on Cancel Job. If the regular expression you entered for getting the job id is invalid or if an error is reported then DDT will not be able to remove your job from the queue. In this case it is strongly recommended that you check the job has been removed before submitting another as it is possible for a forgotten job to execute on the cluster and either waste resources or interfere with other debug sessions.

Once your job is running, it connects to DDT and you can debug it.

5.11 Job scheduling with jsrun

Launching jobs with jsrun in a job scheduling system enables the topology of processes and threads on the node to be split into individual resource sets (the number of GPUs, CPUs, threads, and MPI tasks). You can specify the amount of computational resource allocated to a resource set.

How you decide to allocate resources has an impact on the runtime of Arm DDT and Arm MAP. For example, it is possible to allocate all of the CPUs on the node to just one resource set. Alternatively, you could allocate each CPU to its own resource set; in this case there are as many resource sets as there are CPUs on the node.

The more resource sets you have on each node, the longer the runtime is for Arm DDT and Arm MAP. To minimize runtime, Arm recommends that you aim to reduce the number of resource sets required.

For example, it is recommended to use:

  jsrun --rs_per_host=1 --gpu_per_rs=0 --cpu_per_rs=42 --tasks_per_rs=42 ...

to launch a job with 42 MPI processes per node in a single resource set, instead of:

  jsrun --rs_per_host=42 --gpu_per_rs=0 --cpu_per_rs=1 --tasks_per_rs=1 ...

which launches 42 MPI processes per node, but uses 42 resource sets.
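The arithmetic behind the two command lines can be written out as a short sketch. tasks_per_node is a hypothetical helper that multiplies the --rs_per_host value by the --tasks_per_rs value:

```shell
#!/bin/sh
# MPI tasks per node = resource sets per host x tasks per resource set.
tasks_per_node() {
    echo $(( $1 * $2 ))
}

# Recommended layout: --rs_per_host=1 --tasks_per_rs=42
echo "1 resource set per node:   $(tasks_per_node 1 42) tasks per node"

# Layout to avoid: --rs_per_host=42 --tasks_per_rs=1
echo "42 resource sets per node: $(tasks_per_node 42 1) tasks per node"
```

Both layouts give the same number of MPI processes per node, so the recommended one costs nothing in process count and keeps the number of resource sets at one.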

5.12 Using custom MPI scripts

On some systems a custom 'mpirun' replacement is used to start jobs, such as mpiexec. DDT normally uses whatever the default for your MPI implementation is, so for MPICH 1 it would look for mpirun and not mpiexec. This section explains how to configure DDT to use a custom mpirun command for job start up.

There are typically two ways you might want to start jobs using a custom script, and DDT supports them both. Firstly, you might pass all the arguments on the command-line, like this:

   mpiexec -n 4 /home/mark/program/chains.exe /tmp/mydata

There are several key variables in this line that DDT can fill in for you:

  1. The number of processes (4 in the above example).
  2. The name of your program (/home/mark/program/chains.exe).
  3. One or more arguments passed to your program (/tmp/mydata).

Everything else, like the name of the command and the format of its arguments remains constant. To use a command like this in DDT, you adapt the queue submission system described in the previous section. For this mpiexec example, the settings are as shown here:


Figure 23: DDT Using Custom MPI Scripts

As you can see, most of the settings are left blank. There are some differences between the Submit Command in DDT and what you would type at the command-line:

  1. The number of processes is replaced with NUM_PROCS_TAG.
  2. The name of the program is replaced by the full path to ddt-debugger.
  3. The program arguments are replaced by PROGRAM_ARGUMENTS_TAG.

Note, it is not necessary to specify the program name here. DDT takes care of that during its own startup process. The important thing is to make sure your MPI implementation starts ddt-debugger instead of your program, but with the same options.

The second way you might start a job using a custom mpirun replacement is with a settings file:

   mpiexec -config /home/mark/myapp.nodespec

where myapp.nodespec might contain something similar to the following:

   comp00 comp01 comp02 comp03 : /home/mark/program/chains.exe /tmp/mydata

DDT can automatically generate simple configuration files like this every time you run your program. To do so, you need to specify a template file. For the above example, the template file myfile.ddt would contain the following:

   comp00 comp01 comp02 comp03 : DDTPATH_TAG/bin/ddt-debugger DDT_DEBUGGER_ARGUMENTS_TAG PROGRAM_ARGUMENTS_TAG

This follows the same replacement rules described above and in detail in section A.2 Integration with queuing systems. The options settings for this example might be:


Figure 24: DDT Using Substitute MPI Commands

Note the Submit Command and the Submission Template File in particular. DDT will create a new file and append it to the submit command before executing it. In this case what would actually be executed might be mpiexec -config /tmp/ddt-temp-0112 or similar. Therefore, any argument like -config must be last on the line, because DDT will add a file name to the end of the line. Other arguments, if there are any, can come first.
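To make the tag replacement in the template file concrete, the sketch below imitates it with sed. This is an illustration only, not how DDT performs the substitution: /path/to/ddt stands in for the installation directory, and because the text DDT substitutes for DDT_DEBUGGER_ARGUMENTS_TAG is not described here, a visible placeholder is used for it:

```shell
#!/bin/sh
# The template line from myfile.ddt in the example above.
TEMPLATE='comp00 comp01 comp02 comp03 : DDTPATH_TAG/bin/ddt-debugger DDT_DEBUGGER_ARGUMENTS_TAG PROGRAM_ARGUMENTS_TAG'

# Imitate the substitution. The replacement values are assumptions made
# for this illustration; DDT fills in the real values at run time.
EXPANDED=$(printf '%s\n' "$TEMPLATE" | sed \
    -e 's|DDTPATH_TAG|/path/to/ddt|' \
    -e 's|DDT_DEBUGGER_ARGUMENTS_TAG|<debugger-arguments>|' \
    -e 's|PROGRAM_ARGUMENTS_TAG|/tmp/mydata|')

echo "$EXPANDED"
```

The result has the same shape as the hand-written nodespec file, with ddt-debugger started in place of your program.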

It is recommended that you read the section on queue submission, as there are many features described there that might be useful to you if your system uses a non-standard start up command.

If you do use a non-standard command, please contact Arm support.

5.13 Starting DDT from a job script

The usual way of debugging a program with Arm DDT in a queue/batch environment is to use Reverse Connect, which lets Arm DDT connect back from inside the queue to the GUI. See 3.3 Reverse Connect for more details.

To do this, replace your usual program invocation with an Arm DDT --connect command such as the following:

   ddt --connect --start MPIEXEC -n NPROCS PROGRAM [ARGUMENTS]

The following could also be used:

   ddt --connect --start --once --np=NPROCS -- PROGRAM [ARGUMENTS]

In these examples MPIEXEC is the MPI launch command, NPROCS is the number of processes to start, PROGRAM is the program to run, and ARGUMENTS are the arguments to the program.

The --once argument tells Arm DDT to exit when the session ends.
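As a rough sketch, a job script could assemble the second form of the command from its own variables. The helper name reverse_connect_cmd and the values 4, ./myprog and input.dat are assumptions made for this example. The helper only prints the command line, so it can be checked on a machine without DDT installed.

```shell
#!/bin/sh
# Hypothetical helper: prints a Reverse Connect command line of the form
#   ddt --connect --start --once --np=NPROCS -- PROGRAM [ARGUMENTS]
# $1 = number of processes, remaining arguments = program and its arguments.
reverse_connect_cmd() {
    nprocs=$1
    shift
    printf 'ddt --connect --start --once --np=%s --' "$nprocs"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Example values only; a real job script would use its own settings.
reverse_connect_cmd 4 ./myprog input.dat
```

In a real job script you would run the resulting command directly in place of your usual program invocation, rather than print it.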

The alternative to Reverse Connect for debugging a program in a queue/batch environment is to configure Arm DDT to submit the program to the queue for you. See section 5.10 Starting a job in a queue.

Some users may wish to start Arm DDT itself from a job script that is submitted to the queue/batch environment. To do this:

  1. Configure Arm DDT with the correct MPI implementation.
  2. Disable queue submission in the Arm DDT options.
  3. Create a job script that starts Arm DDT using a command such as:

       ddt --start MPIEXEC -n NPROCS PROGRAM [ARGUMENTS]

    Or the following:

       ddt --start --no-queue --once --np=NPROCS -- PROGRAM [ARGUMENTS]

    In these examples MPIEXEC is the MPI launch command, NPROCS is the number of processes to start, PROGRAM is the program to run, and ARGUMENTS are the arguments to the program.

  4. Submit the job script to the queue. The --once argument tells DDT to exit when the session ends.

5.14 Attaching via gdbserver

DDT can attach to debugging sessions that have been started by gdbserver.

This is typically used for debugging embedded devices only. This should be considered as an expert mode and would not normally be used to debug an application running on a server or workstation.

To prepare for using this mode, you must first start a gdbserver on the target device. Please see the gdbserver documentation for further details, as invocation may be system dependent.

You may then attach to a running application either via the command line or the user interface.

To attach via the command line use:

  ddt --attach-gdbserver=host:port target-executable

Note: The arguments are not optional.
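Because both arguments are required, a small wrapper can check them before DDT is started. This is a sketch under stated assumptions: the helper name gdbserver_attach_cmd, the host device01, the port 2345 and the executable ./target-app are all hypothetical. It only prints the command line and does not contact a gdbserver.

```shell
#!/bin/sh
# Hypothetical helper: validates the two required arguments and prints
#   ddt --attach-gdbserver=host:port target-executable
gdbserver_attach_cmd() {
    target=$1
    exe=$2
    if [ -z "$target" ] || [ -z "$exe" ]; then
        echo "usage: gdbserver_attach_cmd host:port target-executable" >&2
        return 1
    fi
    # The target must contain a colon separating host and port.
    case $target in
        *:*) ;;
        *) echo "target must be host:port" >&2; return 1 ;;
    esac
    # The part after the last colon must be a non-empty decimal port number.
    port=${target##*:}
    case $port in
        ''|*[!0-9]*) echo "port must be numeric" >&2; return 1 ;;
    esac
    printf 'ddt --attach-gdbserver=%s %s\n' "$target" "$exe"
}

# Example values only.
gdbserver_attach_cmd device01:2345 ./target-app
```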

To attach via the user interface, select the Attach dialog on DDT's welcome page. Select the GDB Server tab and substitute the appropriate settings.

If the gdbserver has been used to launch an application, then it will have been stopped before starting the user code. In this case, add a breakpoint in the main function using the Add Breakpoint button, and then play until this is reached. After this point is reached, source code will be displayed.

5.15 UPC

The DDT configuration depends on the UPC compiler used.

5.15.1 GCC UPC

DDT can debug applications compiled with GCC UPC 4.8 with TLS disabled. See section F.5 GNU.

To run a UPC program in DDT, you need to select the MPI implementation "GCC libupc SMP (no TLS)".

5.15.2 Berkeley UPC

To run a Berkeley UPC program in DDT, you need to compile the program using the -tv flag and then select the same MPI implementation used in the Berkeley compiler build configuration.

The Berkeley compiler must be built using the MPI transport.

See section F.3 Berkeley UPC compiler.

5.16 Numactl

DDT supports launching programs via numactl. DDT supports this feature for MPI programs but has limited support for non-MPI programs.

5.16.1 MPI and SLURM

DDT can attach to MPI programs launched via numactl with or without SLURM. The recommended way to launch via numactl is to use express launch mode (5.2 Express Launch).

   $ ddt mpiexec -n 4 numactl -m 1 ./myMpiProgram.exe
   $ ddt srun -n 4 numactl -m 1 ./myMpiProgram.exe

It is also possible to launch via numactl using compatibility mode (5.1 Running a program). When using compatibility mode, you must specify the full path to numactl in the Application box. You can find the full path by running:

   which numactl

Enter the name of the required application in the Arguments field, after all arguments to be passed to numactl. It is not possible to pass any more arguments to the parallel job runner when using this mode for launching.

Note: When using memory debugging, with a program launched via numactl, the Memory Statistics view will report all memory as 'Default' memory type unless allocated with memkind. See 12.7 Memory Statistics.

5.16.2 Non-MPI Programs

There is a minor caveat to launching non-MPI programs via numactl. If you are using SLURM, set ALLINEA_STOP_AT_MAIN=1, otherwise DDT will not be able to attach to the program. For example, the following two commands launch non-MPI programs via numactl:

   $ ddt numactl -m 1 ./myNonMpiProgram.exe
   $ ALLINEA_STOP_AT_MAIN=1 ddt srun \
       numactl -m 1 ./myNonMpiProgram.exe

Once launched, the program stops in numactl main. To resume debugging as normal, set a breakpoint in your code (optional), then use the play and pause buttons to progress and pause the debugging, respectively.
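The SLURM caveat for non-MPI programs can be captured in a small wrapper, sketched here under stated assumptions. The helper name numactl_launch_cmd and its mode names slurm and local are hypothetical. It prints the command line rather than running it, so it works without DDT, SLURM or numactl installed.

```shell
#!/bin/sh
# Hypothetical helper: prints the launch command for a non-MPI program run
# via numactl. With SLURM, ALLINEA_STOP_AT_MAIN=1 is set so that DDT can
# attach to the program.
# $1 = "slurm" or "local"; remaining arguments = numactl options and program.
numactl_launch_cmd() {
    mode=$1
    shift
    case $mode in
        slurm) printf 'ALLINEA_STOP_AT_MAIN=1 ddt srun numactl' ;;
        local) printf 'ddt numactl' ;;
        *) echo "mode must be slurm or local" >&2; return 1 ;;
    esac
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

numactl_launch_cmd local -m 1 ./myNonMpiProgram.exe
numactl_launch_cmd slurm -m 1 ./myNonMpiProgram.exe
```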

5.17 Python debugging

5.17.1 Overview

Python debugging in DDT has the following limited support:

  • Debugging Python scripts running under the CPython interpreter (versions 2.7, 3.5 and 3.6 only).
  • Decoding the stack to show Python frames, function names and line numbers.
  • Displaying Python local and global variables when a Python frame is selected.
  • Stopping on breakpoints and exceptions in native libraries that were invoked from Python code.
  • Debugging MPI programs written in Python using mpi4py.

This feature is useful when debugging a mixed C, C++, Fortran and Python program that crashes somewhere in native code. If this native code was invoked from a Python function, then you can examine the Python stack and local variables that led to the crash. The feature does not currently support breakpoints, stepping, evaluating Python variables, or the current line window.

5.17.2 Prerequisites

On your system, a debug version of Python must be available to DDT. We suggest compiling Python yourself with debug information and with optimization disabled. Alternatively, you can install just the debug symbols. However, whether the binary then contains the information DDT needs to operate normally depends entirely on the distribution vendor. If you wish to take this approach, one solution is to install the Python debug symbols package. You may need to enable additional debug repositories in your package manager.

On Ubuntu:

   $ sudo apt-get install python2.7-dbg

On Redhat:

   $ sudo yum install python-debug

On SuSE:

   $ sudo zypper install python-base-debug

Python debugging depends on GDB 7.12.1. If GDB 7.6.2 is the selected debugger, you need to change to GDB 7.12.1: go to File → Options → System and set the Debugger field to Automatic (recommended).

5.17.3 Running

To debug Python scripts, start the Python interpreter that will execute the script under DDT. To get line level resolution, rather than function level resolution, insert %allinea_python_debug% before your script when passing arguments to Python. To run the demo in the examples folder, change into the examples folder and run the following steps.

Note: The demo requires mpi4py to be installed.

  1. Build the demo:

       $ make -f python.makefile

  2. Start DDT on the Python interpreter:

       $ ../bin/ddt -np 4 python %allinea_python_debug%

    Note: On loading into DDT you will be inside the C code. This is normal as you are debugging the python binary.

  3. Click Run.
  4. Click Play/Continue. If successful, you see a segfault. DDT can help you find the source of this segfault.
  5. To see the Python local variables, open the 'Stacks' view and select a Python frame.

Note: DDT does not search in your PATH when launching executables, so you must specify the full path to Python.

Please be aware that DDT loads the correct Python debugging information based on the executable name. The supported names are python2.7, python3.5 and python3.6, and these cannot be symbolic links. Issues with symbolic links usually manifest when setting up a virtual environment, as it takes the name you specify for the environment.
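The naming and symbolic link restrictions above can be checked before launching DDT. The following is a minimal sketch and not part of DDT: the function name check_python_for_ddt is hypothetical, and it tests only the three conditions stated in this section (supported name, not a symbolic link, executable file).

```shell
#!/bin/sh
# Hypothetical pre-flight check for the Python executable passed to DDT.
# $1 = full path to the Python interpreter.
check_python_for_ddt() {
    path=$1
    name=$(basename "$path")
    # DDT selects debugging information by executable name.
    case $name in
        python2.7|python3.5|python3.6) ;;
        *) echo "unsupported executable name: $name"; return 1 ;;
    esac
    # The executable itself must not be a symbolic link.
    if [ -L "$path" ]; then
        echo "symbolic link: $path"
        return 1
    fi
    if [ ! -x "$path" ]; then
        echo "not an executable file: $path"
        return 1
    fi
    echo "ok: $path"
}
```

For example, check_python_for_ddt "$(command -v python3.6)" resolves the interpreter through PATH first, which also gives you the full path that DDT requires. Inside a virtual environment the interpreter is often a symbolic link, in which case the check reports it.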