
F Compiler notes and known issues

When compiling for a DDT debugging session, always compile with minimal or no optimization. Some compilers reorder instructions and omit debug information when optimization is turned on.

F.1 AMD OpenCL compiler

Not supported by MAP.

The AMD OpenCL compiler can produce debuggable OpenCL binaries. However, the target must be the CPU rather than the GPU device. The build flags -g -O0 must be used when building the OpenCL kernel, typically by setting the environment variable:


   AMD_OCL_BUILD_OPTIONS_APPEND="-g -O0"

The example codes in the AMD OpenCL toolkit can run on the CPU when passed the --device cpu parameter. With the above environment variable set, this produces debuggable OpenCL.
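
As a minimal sketch of the setup described above, the variable can be exported in the shell that launches the program. The sample binary name below is hypothetical and is left commented out; substitute one of the toolkit examples.

```shell
# Ask the AMD OpenCL runtime to build kernels with debug info and no optimization.
export AMD_OCL_BUILD_OPTIONS_APPEND="-g -O0"

# Hypothetical sample binary (an assumption, not a name from the toolkit):
# ./MyOpenCLSample --device cpu

# Show the value that child processes will inherit.
echo "AMD_OCL_BUILD_OPTIONS_APPEND=$AMD_OCL_BUILD_OPTIONS_APPEND"
```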

F.2 Arm Fortran compiler

Debugging of Fortran code may be incomplete or inaccurate. For more information, check the known issues section in the Arm HPC Compiler release notes.

F.3 Berkeley UPC compiler

Not supported by MAP.

The Berkeley UPC compiler is fully supported by Arm DDT, but only when using the MPI conduit (other conduits are not supported).

Warning: If you do not fix the number of threads at compile time (using the -fupc-threads-<numberOfThreads> flag), a known issue arises at the end of program execution.

Note

Source files must end with the extension .upc in order for UPC support to be enabled.

F.4 Cray compiler environment

DDT supports Cray Fast Track Debugging, but only with certain versions of GDB:

  • In DDT 19.0, it is supported in GDB 8.1 and 7.12.1.
  • In DDT 18.2.1, it is supported in GDB 7.12.1 and 7.2.
  • In DDT 5.0, it is only supported when using GDB 7.2, and not when using GDB 7.6.2.

To enable the supported versions of GDB, access the Systems Settings options by selecting File → Options → System (or Options → System, from the Welcome page), then choose from the Debugger options. To enable Fast Track Debugging, compile your program with -Gfast instead of -g.

See the Using Cray Fast-track Debugging section of the Cray Programming Environment User's Guide for more information.

Call-frame information can also be incorrectly recorded, which can sometimes lead to DDT stepping into a function instead of stepping over it. This may also result in time being allocated to incorrect functions in MAP.

C++ pretty printing of the STL is not supported by DDT for the Cray compiler.

Known Issue: When compiling static binaries, linking in the DDT memory debugging library is not straightforward for F90 applications. You need to do the following:

  1. Manually rerun the compiler command with the -v (verbose) option to get the linker command line. It is assumed that the object files are already created.
  2. Run ld manually to produce the final statically linked executable, modifying the ld command obtained in the previous step as follows: add -L{ddt-path}/lib/64 -ldmalloc immediately prior to where -lc is located. For multi-threaded programs you have to add -ldmallocth -lpthread before the -lc option.
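
The edit to the linker line in step 2 can be sketched as a small shell helper. This is an illustration only: the function name insert_dmalloc, the install path /opt/ddt and the example link line are all hypothetical, and the helper does not preserve arguments that contain spaces.

```shell
# insert_dmalloc DDT_PATH LINK_ARGS...
# Prints LINK_ARGS with "-L<DDT_PATH>/lib/64 -ldmalloc" placed immediately
# before the first -lc. Hypothetical helper; arguments containing spaces
# are flattened, so check the result before running it.
insert_dmalloc() {
    ddt_path=$1
    shift
    out=""
    inserted=no
    for arg in "$@"; do
        if [ "$arg" = "-lc" ] && [ "$inserted" = no ]; then
            out="$out -L$ddt_path/lib/64 -ldmalloc"
            inserted=yes
        fi
        out="$out $arg"
    done
    # Drop the leading space added by the first append.
    printf '%s\n' "${out# }"
}

# Hypothetical install path and link line, for illustration only.
insert_dmalloc /opt/ddt ld -o myprog main.o -lm -lc
```

In practice the arguments would be the ld command line reported by the compiler's -v output. For multi-threaded programs the inserted libraries would be -ldmallocth -lpthread, as described in step 2.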

See CUDA/GPU debugging notes for details of Cray OpenMP Accelerator support.

Arm DDT fully supports the Cray UPC compiler. Not supported by MAP.

F.4.1 Compile scalar programs on Cray

To launch scalar code with aprun, using Arm Forge on Cray, your program needs to be linked with Cray PMI. With some configurations of the Cray compiler drivers, Cray PMI is discarded during linking. For static executables, consider using the -Wl,-u,PMI_Init compilation flags to preserve Cray PMI.

If using Arm MAP, see 16.2.2 Linking. If using aprun to launch your program, see H.2.2 Starting scalar programs with aprun. If using SLURM, see H.2.3 Starting scalar programs with srun.

F.5 GNU

The compiler flag -fomit-frame-pointer should never be used in an application which you intend to debug or profile. Doing so can mean Arm Forge cannot properly discover your stack frames and you will be unable to see which lines of code your program has stopped at.

For GNU C++, large projects can often result in vast debug information size, which can lead to large memory usage by DDT's back end debuggers. For example, each instance of an STL class used in different object files will result in the compiler generating the same information in each object file.

The -foptimize-sibling-calls optimization (used in -O2, -O3 and -Os) interferes with the detection of some OpenMP regions. If your code is affected by this issue, add -fno-optimize-sibling-calls to disable it and allow MAP to detect all the OpenMP regions in your code.
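
A possible set of GNU flags that follows the advice above is sketched here. The file names are hypothetical, -fopenmp is GCC's standard OpenMP switch rather than something this section prescribes, and the command is only printed so that the flag set can be reviewed first.

```shell
# Keep frame pointers and disable sibling-call optimization so that Arm Forge
# can discover stack frames and MAP can detect OpenMP regions at -O2.
CFLAGS="-g -O2 -fno-omit-frame-pointer -fno-optimize-sibling-calls"

# Hypothetical source and output names; print the command rather than run it.
echo "gcc $CFLAGS -fopenmp -o myprog myprog.c"
```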

Using the -gdwarf-2 flag together with the -gstrict-dwarf flag may cause problems in stack unwinding, resulting in a "cannot find the frame base" error. DWARF 2 does not provide all the information necessary for unwinding the call stack, so many compilers add DWARF 3 extensions with the missing information. Using the -gstrict-dwarf flag prevents compilers from doing so, resulting in the aforementioned message. Removing -gstrict-dwarf should fix this problem.

F.5.1 GNU UPC

DDT also supports the GCC-UPC compiler (upc_threads_model_process only; the pthread-tls threads model is not supported). MAP does not support this.

To compile and install GCC UPC 4.8 without TLS, you must modify the configuration file path/to/upc/source/code/directory/libgupc/configure, replacing every occurrence of upc_cv_gcc_tls_supported="yes" with upc_cv_gcc_tls_supported="no".
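
One way to make this substitution is with sed. The sketch below wraps it in a helper, disable_upc_tls, which is a hypothetical name introduced here. It rewrites the file through a temporary copy so that it does not rely on sed -i, and it keeps the permissions of the original file.

```shell
# disable_upc_tls FILE
# Rewrites FILE so that every upc_cv_gcc_tls_supported="yes"
# becomes upc_cv_gcc_tls_supported="no". Hypothetical helper name.
disable_upc_tls() {
    file=$1
    tmp="$file.tmp.$$"
    # Write the edited text to a temporary file, then copy the contents back
    # over the original so that its permissions are kept.
    sed 's/upc_cv_gcc_tls_supported="yes"/upc_cv_gcc_tls_supported="no"/g' "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&
        rm -f "$tmp"
}

# Usage, with the placeholder path from the text above:
# disable_upc_tls path/to/upc/source/code/directory/libgupc/configure
```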

To run a UPC program in DDT, select the MPI implementation "GCC libupc SMP (no TLS)".

F.6 IBM XLC/XLF

It is advisable to use the -qfullpath option to the IBM compilers (XLC/XLF) in order for source files to be found automatically when they are in directories other than that containing the executable. This flag has been known to fail for mpxlf95, and so there may be circumstances when you must right click in the project navigator and add additional paths to scan for source files.

Module data items behave differently between 32-bit and 64-bit mode, with 32-bit mode generally enabling access to more module variables than 64-bit mode.

Using IBM XL compilers with optimization level -O2 or higher can lead to some partial traces. This occurs because MAP does not have enough information to fully unwind the call stack.

Missing debug information in the binaries produced by XLF can prevent DDT from showing the values in Fortran pointers and allocatable arrays correctly, and assumed-size arrays cannot be shown at all. Please update to the latest compiler version before reporting this to Arm support.

Sometimes, when a process is paused inside a system or library call, DDT will be unable to display the stack, or the position of the program in the Code view. To get around this, it is sometimes necessary to select a known line of code and choose Run to here. If this bug affects you, please contact Arm support.

For the best OpenMP debug experience, compile your code with -qsmp=omp:noopt instead of -qsmp=omp. For more information about the issues you may encounter when debugging OpenMP, see 5.5 Debugging OpenMP programs.
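
As an illustration, the XL options recommended in this section could be collected as follows. The file names are hypothetical, xlf90_r (the thread-safe XL Fortran invocation) is an assumption about your installation, and the command is printed rather than run.

```shell
# -qfullpath records full source paths; -qsmp=omp:noopt enables OpenMP
# without optimizing the parallel regions.
FFLAGS="-g -qfullpath -qsmp=omp:noopt"

# Hypothetical source and output names; print the command rather than run it.
echo "xlf90_r $FFLAGS -o myprog myprog.f90"
```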

DDT has been tested against the C compiler xlc version 13.1 and Fortran/Fortran 90 compiler xlf version 15.1 on Linux.

To view Fortran assumed-size arrays in DDT, you must first right-click on the variable, select Edit Type..., and enter the type of the variable with its bounds, for example integer arr(5).

MAP only supports xlc and xlf on Linux.

F.7 Intel compilers

DDT and MAP have been tested with versions 13, 14, 16 and 17.

If you experience problems with missing or incomplete stack traces (for example [partial trace] entries in MAP or no stack traces for allocations in DDT's View Pointer Details window) try recompiling your program with the -fno-omit-frame-pointer argument. The Intel compiler may omit frame pointers by default which can mean Arm Forge cannot properly discover your stack frames and you will be unable to see which lines of code your program has stopped at.

Some optimizations performed when -ax options are specified to IFC/ICC can result in programs which cannot be debugged. This is due to the reuse by the compiler of the frame-pointer, which makes DDT unable to obtain a stack trace.

Some optimizations performed using Interprocedural Optimization (IPO), which is implicitly enabled by the -O3 flag, can interfere with MAP's ability to display call stacks, making it more difficult to understand what the program is doing. To prevent this, it is recommended that IPO be disabled by adding -no-ip -no-ipo to the compiler flags. The -no-ip flag disables IPO within files while -no-ipo disables IPO between files.
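
A flag set that combines the suggestions in this section might look like the sketch below. The file names are hypothetical, and the command is only printed so that it can be adapted to your build system.

```shell
# Disable IPO within files (-no-ip) and between files (-no-ipo), and keep
# frame pointers so that call stacks can be unwound.
FFLAGS="-g -O3 -no-ip -no-ipo -fno-omit-frame-pointer"

# Hypothetical source and output names; print the command rather than run it.
echo "ifort $FFLAGS -o myprog myprog.f90"
```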

The Intel compiler does not always provide enough information to correctly determine the bounds of some Fortran arrays when they are passed as parameters, in particular the lower-bound of assumed-shape arrays.

The Intel OpenMP compiler will always optimize parallel regions, regardless of any -O0 settings. This means that your code may jump around unexpectedly while stepping inside such regions, and that any variables which may have been optimized out by the compiler may be shown with nonsense values. There have also been problems reported in viewing thread-private data structures and arrays. If these affect you, please contact Arm support.

Files with a .F or .F90 extension are automatically preprocessed by the Intel compiler. This can also be turned on with the -fpp command-line option. Unfortunately, the Intel compiler does not include the correct location of the source file in the executable produced when preprocessing is used. If your Fortran file does not make use of macros and does not need preprocessing, you can simply rename its extension to .f or .f90 and/or remove the -fpp flag from the compile line instead. Alternatively, you can help DDT discover the source file by right clicking in the Project Files window and then selecting Add/view source directory and adding the correct directory.

Some versions of the compiler emit incorrect debug information for OpenMP programs which may cause some OpenMP variables to show as <not allocated>.

By default Fortran PARAMETERS are not included in the debug information output by the Intel compiler. You can force them to be included by passing the -debug-parameters all option to the compiler.

Known Issue: When compiling static binaries, for example on a Cray XT/XE machine, linking in the DDT memory debugging library is not straightforward for F90 applications. You need to manually rerun the last ld command (as seen with ifort -v) with two changes:

  1. Add -L{ddt-path}/lib/64 -ldmalloc immediately prior to where -lc is located.
  2. Include the -zmuldefs option at the start of the ld line.

STL sets, maps and multi-maps cannot be fully explored as only the total number of items is displayed. Other data types are unaffected.

To disable pretty printing set the environment variable ALLINEA_DISABLE_PRETTY_PRINTING to 1 before starting DDT. This will enable you to manually inspect the variable in the case of, for example, the incomplete std::set implementations.
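
For example, in a POSIX shell the variable can be exported before launching DDT. The launch line is commented out because the program name is hypothetical and it assumes the ddt command is on your PATH.

```shell
# Disable DDT's STL pretty printers for this shell session.
export ALLINEA_DISABLE_PRETTY_PRINTING=1

# Then start DDT from the same shell (hypothetical program name):
# ddt ./myprog
echo "ALLINEA_DISABLE_PRETTY_PRINTING=$ALLINEA_DISABLE_PRETTY_PRINTING"
```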

F.8 Pathscale EKO compilers

Not supported by MAP.

There are some known issues as shown in the following list:

  • The default Fortran compiler options may not generate enough information for DDT to show where memory was allocated from. View Pointer Details will not show which line of source code memory was allocated from. To enable this, compile and link with the following flags:
    
       -Wl,--export-dynamic -TENV:frame_pointer=ON -funwind-tables
         
         
    
  • For C programs, simply compiling with -g is sufficient.
  • When using the Fortran compiler, you may have to place breakpoints in myfile.i instead of myfile.f90 or myfile.F90. Arm is currently investigating this. Please contact Arm support if this applies to your code.
  • Procedure names in modules often have extra information appended to them. This does not otherwise affect the operation of DDT with the Pathscale compiler.
  • The Pathscale 3.1 OpenMP library has an issue which makes it incompatible with programs that call the fork system call on some machines.
  • Some versions of the Pathscale compiler (for example, 3.1) do not emit complete DWARF debugging information for typedef'ed structures. These may show up in DDT with a void type instead of the expected type.
  • Multi-dimensional allocatable arrays can also be given incorrect dimension upper or lower bounds. This has only been reproduced for large arrays; small arrays seem to be unaffected. It has been observed with version 3.2 of the compiler, but newer and older versions may also exhibit the same issue.

F.9 Portland Group compilers

DDT has been tested with Portland Tools 9 onwards.

MAP has been tested with version 14 of the PGI compilers. Older versions are not supported as they do not allow line level profiling. Always compile with -Meh_frame to provide sufficient information for profiling.

If you experience problems with missing or incomplete stack traces (for example, [partial trace] entries in MAP or no stack traces for allocations in DDT's View Pointer Details window), try recompiling your program with the -Mframe argument. The PGI compiler may omit frame pointers by default, which can mean Arm Forge cannot properly discover your stack frames and you will be unable to see which lines of code your program has stopped at.
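
Taken together, the PGI options mentioned above could be combined as in this sketch. The file names are hypothetical, and the command is printed rather than run.

```shell
# -Meh_frame gives MAP the information it needs for profiling;
# -Mframe keeps frame pointers so that stack traces are complete.
FFLAGS="-g -Meh_frame -Mframe"

# Hypothetical source and output names; print the command rather than run it.
echo "pgf90 $FFLAGS -o myprog myprog.f90"
```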

Some known issues are listed here:

  • Included files in Fortran 90 generate incorrect debug information with respect to file and line information. The information gives line numbers which refer to line numbers from the included file but give the including file as the file.
  • The PGI compiler may emit incorrect line number information for templated C++ functions or omit it entirely. This may cause DDT to show your program on a different line to the one expected, and also mean that breakpoints may not function as expected.
  • The PGI compiler does not emit the correct debugging tags for proper support of inheritance in C++, which prevents viewing of base class members.
  • When using memory debugging with statically linked PGI executables (-Bstatic), the built-in ordering of library linkage for F77/F90 means that you need to add a localrc file to your PGI installation that defines the correct linkage for DDT and (static) memory debugging. Append the following to your {pgi-path}/bin/localrc:
    
       switch -Bstaticddt is
       help(Link for DDT memory debugging with static binding)
       helpgroup(linker)
       append(LDARGS=--eh-frame-hdr -z muldefs)
       append(LDARGS=-Bstatic)
       append(LDARGS=-L{DDT-Install-Path}/lib/64)
       set(CRTL=$if(-Bstaticddt,-ldmallocthcxx -lc -lns$(PREFIX)c
       -l$(PREFIX)c, -lc -lns$(PREFIX)c -l$(PREFIX)c))
       set(LC=$if(-Bstaticddt,-ldmallocthcxx -lgcc -lgcc_eh -lc -lgcc
       -lgcc_eh -lc, -lgcc -lc -lgcc));

    pgf90 -help will now list -Bstaticddt as a compilation flag. You should now use that flag for memory debugging with static linking.

    This does not affect the default method of using PGI and memory debugging, which is to use dynamic libraries.

    Note that some versions of ld (notably in SLES 9 and 10) silently ignore the --eh-frame-hdr argument in the above configuration, and a full stack for F90 allocated memory will not be shown in DDT. You can work around this limitation by replacing the system ld, or by including a more recent ld earlier in your path. This does not affect memory debugging in C/C++.

  • When you pass an array splice as an argument to a subroutine that has an assumed shape array argument, the offset of the array splice is currently ignored by DDT. Please contact Arm support if this affects you.
  • DDT may show extra symbols for pointers to arrays and some other types. For example if your program uses the variable ialloc2d then the symbol ialloc2d$sd may also be displayed. The extra symbols are added by the compiler and may be ignored.
  • The Portland compiler also wraps F90 allocations in a compiler-handled allocation area, rather than using the system's memory allocation libraries directly for each allocate statement. This means that bounds protection (Guard Pages) cannot function correctly with this compiler.
  • DDT passes on all variables that the compiler has told gdb to be in scope for a routine. For the PGI compiler this can include internal variables and variables from Fortran modules even when the only clause has been used to restrict access. DDT is unable to restrict the list to variables actually used in application code.
  • Versions of the PGI compiler prior to 14.9 are unable to compile a static version of the Arm MPI wrapper library. Attempting to do so results in messages such as "Error: symbol 'MPI_F_MPI_IN_PLACE' can not be both weak and common". This is due to a bug in the PGI compiler's weak object support.

    For information concerning the Portland Accelerator model and debugging this with DDT, see section 14 CUDA GPU debugging of this user guide.
