
Index

Arm DDT
Controlling program execution, 1
Getting started, 2
Getting Support, 3
Installation, 4
Introduction, 5
Logbook, 6
Overview, 7
Program input and output, 8
Running a program, 9
Starting a program
From a job script, 10
Starting, stopping and restarting, 11
Supported platforms, 12
Arm MAP
Cray MPT, 13
Custom metrics, 14
Displaying selected processes, 15
Environment variables, 16
Functions view, 17
Getting started, 18
Getting Support, 19
Installation, 20
Introduction, 21
JSON, 22
Activities, 23
Categories, 24
Example, 25
Metrics, 26
Metrics view, 27
Program output, 28
Project files view, 29
Restricting output, 30
Running from the command line, 31
Saving output, 32
Standard error, 33
Standard output, 34
Starting from job script, 35
Supported platforms, 36
View modes, 37
Main thread only, 38
OpenMP mode, 39
Pthread mode, 40
Viewing totals, 41
Arm IPMI Energy Agent, 42
Requirements, 43
Mac OS X, 44

Accelerator, 45
Align stacks, 46
AMD
OpenCL, 47
Apple
Mac OS X, 48
Application, 49, 50
Arbitrary expressions and global variables, 51
Architecture licensing, 52
Multiple architectures, 53
Arm, 54, 55
Arm (AArch64), 56
Array
Distributed, 57
Expression, 58
Filtering, 59
Multi-dimensional
Viewing, 60
Array data
Viewing, 61
Arrays
Auto Update, 62
Comparing elements across processes, 63
Export, 64
Layout
Data table, 65
Multi-dimensional, 66
Statistics, 67
Visualization, 68
Assembly debugging, 69
Breakpoints, 70
Toggling and viewing, 71
Attaching, 72, 73, 74
Choose hosts, 75
Command line, 76
Configuring
Remote hosts, 77
Hosts file, 78
Attaching to running programs, 79
AUTO_LAUNCH_TAG, 80

Backtrace, 81
Batch schedulers, 82
Berkeley UPC, 83
Bounds checking, 84
Branch instructions, 85
Branch mispredicts, 86, 87
Breakpoints, 88
Conditional, 89
CUDA, 90
Default, 91
Deleting, 92
Focus, 93
Loading, 94
Saving, 95
Setting, 96
Pending, 97
Using source code viewer, 98
Using the Add Breakpoint window, 99
Buffer overflow, 100
Building applications, 101, 102
Bull MPI, 103

C++ STL, 104
C++ STL support, 105
Caliper, 106
Cobalt, 107, 108
Colour Scheme, 109
Compilers
AMD, 110
Cray, 111
GNU, 112
IBM XLC/XLF, 113
Intel, 114
Known issues, 115
OpenCL, 116
Pathscale EKO compilers, 117
Portland Group, 118
Completed instructions, 119, 120
Complex numbers, 121
Configuration, 122, 123
Appearance, 124
Code viewer, 125
Configuration files, 126
Connecting to remote programs, 127
Converting legacy sitewide configuration files, 128
Importing legacy, 129
Job size, 130
Job submission, 131
Optional, 132
Queue commands, 133
Queuing systems, 134
Quick restart, 135
Sitewide, 136
Startup scripts, 137
System, 138
Template script, 139
Template tutorial, 140
Using a shared installation on multiple systems, 141
Using shared home directories on multiple systems, 142
Connecting to a remote system, 143
Consistency checking
Heap, 144
Core Files, 145
Core files, 146
CPU branch, 147
CPU branch mispredictions, 148
CPU floating-point, 149
CPU floating-point vector, 150
CPU FLOPS lower bound, 151
CPU FLOPS vector lower bound, 152
CPU instructions, 153
CPU integer, 154
CPU integer vector, 155
CPU memory access, 156
CPU Memory Accesses, 157
CPU power usage, 158
CPU time, 159
Cray, 160, 161
Compiling scalar programs, 162
Starting scalar programs, 163
Cray ATP, 164
Cray compiler environment, 165
Cray MPT, 166
Cray Native SLURM, 167, 168
Cray X, 169
Cray X-Series, 170, 171, 172, 173, 174, 175
Cray XK6, 176
Cross-process comparison, 177, 178
Cross-thread comparison, 179
CUDA
Breakpoints, 180, 181
Controlling GPU threads, 182
CUDA Fortran, 183
DDT: CUDA, 184
Debugging multiple GPU processes, 185
Examining GPU threads and data, 186
GPU Debugging, 187
GPU device information, 188
IBM XLC/XLF with offloading OpenMP, 189
Launching, 190
Licensing, 191
Memory debugging, 192
NVIDIA, 193
Preparing to debug, 194
Running, 195
Running and pausing, 196
Selecting GPU threads, 197
Source code viewer, 198
Stepping, 199
Thread control, 200
Understanding kernel progress, 201
Viewing GPU thread locations, 202
CUDA profiling, 203
Current line, 204
Custom MPI scripts, 205
Cycles per instruction, 206, 207, 208
Cycles per instruction (Armv8-A), 209

Data
Changing, 210
Deadlock, 211
Debugging
Scalar, 212
Debugging symbols, 213
Detecting leaks, 214
Disassembler, 215
Disk read transfer, 216
Disk write transfer, 217
DP FLOPS, 218
Duration, 219
Dynamic linking
Cray X-Series, 220

Editing source code, 221, 222
End Session, 223
Energy metrics
Requirements, 224
Environment variables, 225, 226
Express Launch, 227, 228
Run dialog box, 229
Expression
Changing language, 230
External Editor, 231

Fencepost checking, 232
Files
Viewing multiple, 233
Find in Files, 234
Floating-point scalar instructions, 235
Floating-point vector instructions, 236
Focus
Breakpoints, 237
Changing, 238
Code viewer, 239
Parallel stack view, 240
Playing, 241
Process group viewer, 242
Step threads together, 243
Stepping, 244
Stepping threads window, 245
Focus control, 246
Font, 247, 248
Fortran intrinsics, 249
Fortran Modules, 250
Function Listing, 251
Functions view, 252

gdbserver, 253
Attaching, 254
General troubleshooting, 255
GNU compiler, 256
GNU UPC, 257
GNU/Linux systems, 258
Go To Line, 259
GPU, 260
Attaching, 261
Device information, 262
GPU Language support, 263
GPU kernels tab, 264
GPU memory usage, 265
GPU power usage, 266
GPU profiling, 267
GPU temperature, 268
GPU utilization, 269

Heap Overflow, 270
HP MPI, 271

I/O, 272
I/O time, 273
IBM XLC/XLF, 274
Inf, 275
Installation, 276
Mac OS X, 277
Linux
Graphical, 278
Text-mode install, 279
Windows, 280
Intel Compiler, 281, 282
Intel compiler, 283
Intel Message Checker, 284
Intel MPI, 285
MPMD, 286
remote-exec, 287
Intel Xeon, 288
Intel Xeon RAPL, 289
Introduction, 290
Involuntary context switches, 291
IPMI, 292

Job ID regular expression, 293
Job scheduling, 294
Job submission, 295, 296
Cancelling, 297, 298
Custom, 299
Regular expression, 300, 301, 302
JSON, 303
Jump To Line
Double clicking, 304

Kernel-mode CPU time, 305
Known issues
Arm Forge times out, 306
MAP adds unexpected overhead, 307
MAP collects very deep stack traces with boost::coroutine, 308
MAP not correctly identifying vectorized instructions, 309
MAP over-reports MPI time, 310
MAP reporting time spent in function definition, 311
MAP specific issues, 312
MAP takes long time to analyze OpenBLAS app, 313
Arm (AArch64), 314
Attaching, 315
Cannot find executable, 316
Cannot find hosts, 317
Compiler, 318
Compiler inlining functions, 319
Controlling a program, 320
DDT stops responding, 321
Program jumps while stepping, 322
Cray, 323
Deadlock calling printf or malloc from a signal handler, 324
Evaluating variables, 325
C++ STL are not pretty printed, 326
Evaluating an array of derived types, 327
Incorrect values printed for Fortran array, 328
Variables cannot be viewed, 329
F1 Help, 330
General, 331
Input/Output, 332
Output to stderr not displayed, 333
Unwind errors, 334
Linking with static MAP sampler library fails, 335, 336
Memory debugging, 337
MPI, 338
MPI wrapper libraries, 339
mprotect fails, 340
No shared home directory, 341
Not enough samples, 342
Only main code visible, 343
Platform, 344
Programs run slowly, 345
Progress bar does not move, 346
Running processes do not show up in the attach window, 347
Source code, 348
No variables or line number information, 349
Source code does not appear, 350
Source code folding does not work, 351
Starting multi-process programs, 352
Starting scalar programs, 353
Starting the GUI, 354
System does not allow debuggers to connect to processes, 355, 356
Tail call optimization, 357
Thread support limitations, 358

L1 cache misses, 359
L2 cache misses, 360
L2 Data cache miss, 361
L2 data cache misses, 362
L3 cache miss per instruction, 363
L3 cache misses, 364
Licensing
Architecture licensing, 365
Multiple architectures, 366
Floating licenses, 367
License files, 368
Single process license, 369
Single-process license, 370
Supercomputing and other floating licenses, 371
Workstation and evaluation licenses, 372
Linking, 373
Dynamic
On Cray X-Series using modules environment, 374
Static, 375
On Cray X-Series using modules environment, 376
Loadleveler, 377, 378
Local variables, 379
Log file, 380
Logbook
Arm DDT Logbook, 381
Annotation, 382
Comparison window, 383
Usage, 384
Lustre file opens, 385
Lustre metadata operations, 386
Lustre read transfer, 387
Lustre write transfer, 388

Mac OS X, 389
Macros, 390
Manual launch
ddt-client, 391
Debugging multi-process non-MPI programs, 392
Manual process selection, 393
map-link modules, 394
Installation
Cray X-Series, 395
Memory debugging, 396, 397
Available checks, 398
Changing settings at run time, 399
Configuration, 400, 401
Cray MPT, 402
Detecting leaks, 403
Enabling, 404
Library usage errors, 405
Memory Statistics, 406
mprotect fails, 407
PMDK, 408
Pointer error detection, 409
Static linking, 410
Suppressing an error, 411
Validity checking, 412
Writing beyond an allocated area, 413
Memory leak, 414
Memory leak report, 415
Memory usage, 416, 417
Message Queues, 418
Message queues, 419
Deadlock, 420
Interpreting, 421
Viewing, 422
Metrics, 423
Accelerator, 424, 425
Branch mispredicts (Armv8-A), 426
Branch mispredicts (Power 9), 427
CPU branch, 428
CPU branch mispredictions, 429
CPU floating-point, 430
CPU floating-point vector, 431
CPU FLOPS lower bound, 432
CPU FLOPS vector lower bound, 433
CPU instructions, 434
CPU integer, 435
CPU integer vector, 436
CPU memory access, 437
CPU Memory Accesses, 438
CPU power usage, 439
CPU time, 440
Cycles per instruction, 441, 442
Cycles per instruction (Armv8-A), 443
Detecting MPI imbalance, 444
Disk read transfer, 445
Disk write transfer, 446
Energy, 447
GPU memory usage, 448
GPU power usage, 449
GPU temperature, 450
GPU utilization, 451
I/O, 452
I/O time, 453
Involuntary context switches, 454
Kernel-mode CPU time, 455
L2 Data cache miss, 456
L3 cache miss per instruction, 457
Lustre, 458
Lustre file opens, 459
Lustre metadata operations, 460
Lustre read transfer, 461
Lustre write transfer, 462
Memory, 463
Memory usage, 464
MPI, 465
MPI call duration, 466
MPI communication and waiting time, 467
MPI point-to-point and collective bytes, 468
MPI point-to-point and collective operations, 469
MPI sent and received, 470
Node memory usage, 471
OpenMP
Multi-threaded computation time, 472
Multi-threaded MPI computation time, 473
Overhead, 474
Thread synchronization time, 475
Time inside an OpenMP region, 476
OpenMP Overhead, 477
POSIX I/O read rate, 478
POSIX I/O write rate, 479
POSIX read syscall rate, 480
POSIX write syscall rate, 481
Single-threaded computation time, 482
Stalled backend cycles, 483, 484
Stalled frontend cycles, 485
System load, 486
System power usage, 487
Time in global memory accesses, 488
User-mode CPU time, 489
Voluntary context switches, 490
Zooming, 491
Metrics view, 492
Mispredicted branch instructions, 493
Moab, 494, 495
MOM nodes, 496
MPC, 497
mpirun, 498
MPI, 499
Distributions, 500
Function Counters, 501
History/Logging, 502
MPI rank, 503
MPI Ranks, 504
mpirun, 505
Running, 506
Troubleshooting, 507
MPI call duration, 508
MPI communication and waiting time, 509
MPI job
Attaching to a subset, 510
Automatic detection, 511
MPI point-to-point and collective bytes, 512
MPI point-to-point and collective operations, 513
MPI sent and received, 514
MPI wrapper libraries, 515
MPI_Init
remote-exec, 516
MPICH, 517
p4, 518
p4 mpd, 519
MPICH 1
remote-exec, 520
MPICH 1 based MPI, 521
MPICH 2, 522
MPMD, 523
remote-exec, 524
MPICH 3, 525
MPMD, 526
remote-exec, 527
mpirun
remote-exec, 528
mpirun_rsh, 529
MPMD
Compatibility mode, 530
Intel MPI, 531
MPICH 2, 532
MPICH 3, 533
remote-exec, 534
Running, 535, 536
MPMD programs
Debugging, 537
Compatibility mode, 538
Without Express Launch, 539
Multi-dimensional array viewer (MDA), 540
Multi-threaded computation time, 541
Multi-threaded MPI computation time, 542
MVAPICH 2, 543

Navigating through source code history, 544
Node memory usage, 545
Numactl
DDT, 546
MAP, 547
Number bases
Viewing, 548
nvcc, 549
Nvidia CUDA, 550
Known issues, 551
NVIDIA Tegra 2, 552

Obtaining Help, 553
Obtaining support, 554
Offline debugging, 555
HTML report, 556
Periodic snapshots, 557
Plain text report, 558
Reading a file for standard input, 559
Run-time job progress reporting, 560
Signal-triggered snapshots, 561
Using, 562
Writing a file from standard output, 563
Offloading OpenMP, 564
Open MPI, 565
MPMD, 566
Compatibility mode, 567
OpenACC, 568
OpenCL, 569
OpenGL, 570, 571
OpenMP, 572
Debugging, 573
Offloading, 574
OMP_NUM_THREADS, 575
Regions, 576
Running, 577, 578
OpenMP overhead, 579
OpenMP Regions view, 580
Oracle Grid Engine, 581, 582

PAPI, 583
Branch instructions, 584
Branch prediction, 585
Cache misses, 586
Completed instructions, 587, 588
Config file, 589
Cycles per instruction, 590
DP FLOPS, 591
Floating-point, 592
Floating-point scalar instructions, 593
Floating-point vector instructions, 594
Install, 595
L1 cache misses, 596
L2 cache misses, 597
L2 data cache misses, 598
L3 cache misses, 599
Metrics, 600
Mispredicted branch instructions, 601
Overview metrics, 602
Vector instructions, 603
Parallel Stack View, 604
Pathscale EKO compilers, 605
PBS, 606, 607
Pending breakpoints, 608
Perf, 609
-target-host, 610
advanced configuration, 611
Command line, 612
Metrics, 613
Probe, 614
Run window, 615
Template file, 616
Viewing, 617
perf_event_paranoid, 618
PGI Accelerators, 619
Platform MPI, 620
Plugins, 621
Enabling, 622
Installing, 623
Reference, 624
Supported, 625
Using, 626
Writing, 627
PMDK, 628
Pointer details, 629, 630
Pointer error detection, 631
Pointers, 632
Portland Group, 633
POSIX I/O read rate, 634
POSIX I/O write rate, 635
POSIX read syscall rate, 636
POSIX write syscall rate, 637
POWER8 and POWER9, 638
Pretty printers, 639
Process details, 640
Process Group Viewer, 641
Process groups, 642
Deleting, 643
Detailed view, 644
Summary view, 645
Processes and cores view, 646
PROCS_PER_NODE_TAG, 647
Profile a Python script, 648
Profiling, 649, 650
Preparing a program, 651
Program part, 652
Programming errors, 653
Python
Running, 654
Python Profiling, 655
Python profiling known issues, 656

Queue submission, 657
Cancelling, 658
Queue submission via Express Launch, 659
Queue template syntax, 660
Environment variables
PROCS_PER_NODE_TAG, 661
Queue template tags, 662
Defining new tags, 663
Environment variable
AUTO_LAUNCH_TAG, 664
Launching, 665
Specifying default options, 666
Using ddt-mpirun, 667

Raw command, 668
Raw Command Window, 669
Rebuilding applications, 670, 671
Receive queue, 672
Registers
Viewing, 673
Remote Client
Installation
Mac OS X, 674
Windows, 675
Remote client, 676
Configuration, 677
Multiple hops, 678
Remote launch, 679
Remote script, 680
Using X forwarding or VNC, 681
remote-exec
Required, 682
Requirements
Energy metrics, 683
Restarting, 684
Reverse Connect, 685
Run-time
Job progress reporting, 686
Running
MPMD, 687, 688
Scalar, 689
Running a program, 690
Running programs
Attaching, 691
Manual process selection, 692

Saving output, 693
Scalar
Debugging, 694
Running, 695
Scalar programs, 696
Search, 697, 698
Selected Lines View, 699
Send queue, 700
Send signal, 701
Sending signals, 702
Session
Saving, 703
Session menu, 704
SGI, 705
SGI MPT
remote-exec, 706
Shared arrays, 707
Signal Handling
Divisions by zero, 708
Floating Point Exception, 709
Segmentation fault, 710
SIGFPE, 711
SIGILL, 712
SIGPIPE, 713
SIGSEGV, 714
SIGUSR1, 715
SIGUSR2, 716
Signal handling, 717
Custom, 718
Sending signals, 719
Single stepping, 720
Single-threaded computation time, 721
SLURM, 722, 723, 724
Slurm
Starting scalar programs, 725
SMP
Performance, 726
Source Code, 727
Source code, 728, 729
Application and external code split, 730
Committing, 731, 732
Editing, 733, 734
Find in Files, 735
Missing files, 736
Project files, 737
Rebuilding, 738, 739
Searching, 740, 741
Viewing, 742, 743
Sparkline, 744
Sparklines, 745
Spectrum MPI, 746
Spindle, 747
Stack frame, 748
Stacks table, 749
Stacks view, 750
Stalled backend cycles, 751, 752
Stalled frontend cycles, 753
Standard error, 754
Standard input, 755, 756
Standard output, 757
Starting, 758
Starting MAP, 759
Static analysis, 760
Static checking, 761
Static linking, 762
On Cray X-Series, 763
Step threads together, 764
Stop messages, 765
Stopping, 766
Supported platforms, 767
Arm DDT, 768
Arm MAP, 769
Batch schedulers, 770
Suspending breakpoints, 771
Synchronizing processes, 772
System load, 773
System power usage, 774

Tab size, 775
Tail call optimization, 776
Thread synchronization time, 777
Time in global memory accesses, 778
Time inside an OpenMP region, 779
Time spent on selected lines, 780
TORQUE, 781, 782
Tracepoints, 783
Setting, 784
Tracepoint output, 785

Unexpected queue, 786
Unified Parallel C, 787, 788
Unwind errors, 789
UPC, 790
Berkeley, 791
GNU, 792
User-mode CPU time, 793
Using custom MPI scripts, 794

Validity checking, 795
Variables, 796
Searching, 797, 798
Unused variables, 799
Vector instructions, 800
Version control
Breakpoints and tracepoints, 801
Version control information, 802
Viewing multiple files, 803
Viewing stacks, 804
Overview, 805
Parallel Stack View, 806
Viewing stacks in parallel, 807
Visualize Whitespace, 808
VNC, 809, 810
Voluntary context switches, 811

Warning Symbols, 812
Watchpoints, 813
Welcome Page, 814
Welcome Screen, 815

X forwarding, 816
X11, 817
XK6, 818

Zooming, 819
