
Index

Arm DDT
Controlling program execution, 1
Getting started, 2
Getting Support, 3
Installation, 4
Introduction, 5
Logbook, 6
Overview, 7
Program input and output, 8
Running a program, 9
Starting a program
From a job script, 10
Starting, stopping and restarting, 11
Supported platforms, 12
Arm MAP
Cray MPT, 13
Custom metrics, 14
Displaying selected processes, 15
Environment variables, 16
Functions view, 17
Getting started, 18
Getting Support, 19
Installation, 20
Introduction, 21
JSON, 22
Activities, 23
Categories, 24
Example, 25
Metrics, 26
Metrics view, 27
Program output, 28
Project files view, 29
Restricting output, 30
Running from the command line, 31
Saving output, 32
Standard error, 33
Standard output, 34
Starting from job script, 35
Supported platforms, 36
View modes, 37
Main thread only, 38
OpenMP mode, 39
Pthread mode, 40
Viewing totals, 41
Arm IPMI Energy Agent, 42
Requirements, 43
Mac OS X, 44

Accelerator, 45
Align stacks, 46
AMD
OpenCL, 47
Apple
Mac OS X, 48
Application, 49, 50
Arbitrary expressions and global variables, 51
Architecture licensing, 52
Multiple architectures, 53
Arm, 54, 55
Arm (AArch64), 56
Array
Distributed, 57
Expression, 58
Filtering, 59
Multi-dimensional
Viewing, 60
Array data
Viewing, 61
Arrays
Auto Update, 62
Comparing elements across processes, 63
Export, 64
Layout
Data table, 65
Multi-dimensional, 66
Statistics, 67
Visualization, 68
Attaching, 69, 70, 71
Choose hosts, 72
Command line, 73
Configuring
Remote hosts, 74
Hosts file, 75
Attaching to running programs, 76
AUTO_LAUNCH_TAG, 77

Backtrace, 78
Batch schedulers, 79
Berkeley UPC, 80
Bounds checking, 81
Branch instructions, 82
Breakpoints, 83
Conditional, 84
CUDA, 85
Default, 86
Deleting, 87
Focus, 88
Loading, 89
Saving, 90
Setting, 91
Pending, 92
Using source code viewer, 93
Using the Add Breakpoint window, 94
Buffer overflow, 95
Building applications, 96, 97
Bull MPI, 98

C++ STL, 99
C++ STL support, 100
Cobalt, 101, 102
Colour Scheme, 103
Compilers
AMD, 104
Cray, 105
GNU, 106
IBM XLC/XLF, 107
Intel, 108
Known issues, 109
OpenCL, 110
Pathscale EKO compilers, 111
Portland Group, 112
Completed instructions, 113, 114
Complex numbers, 115
Configuration, 116, 117
Appearance, 118
Code viewer, 119
Configuration files, 120
Connecting to remote programs, 121
Converting legacy sitewide configuration files, 122
Importing legacy, 123
Job size, 124
Job submission, 125
Optional, 126
Queue commands, 127
Queuing systems, 128
Quick restart, 129
Sitewide, 130
Startup scripts, 131
System, 132
Template script, 133
Template tutorial, 134
Using a shared installation on multiple systems, 135
Using shared home directories on multiple systems, 136
Connecting to a remote system, 137
Consistency checking
Heap, 138
Core Files, 139
Core files, 140
CPU branch, 141
CPU branch mispredictions, 142
CPU cycles, 143
CPU floating-point, 144
CPU floating-point vector, 145
CPU FLOPS lower bound, 146
CPU FLOPS vector lower bound, 147
CPU instructions, 148
CPU integer, 149
CPU integer vector, 150
CPU memory access, 151
CPU Memory Accesses, 152
CPU power usage, 153
CPU time, 154
Cray, 155, 156
Compiling scalar programs, 157
Starting scalar programs, 158
Cray ATP, 159
Cray compiler environment, 160
Cray MPT, 161
Cray Native SLURM, 162, 163
Cray X, 164
Cray X-Series, 165, 166, 167, 168, 169, 170
Cray XK6, 171
Cross-process comparison, 172, 173
Cross-thread comparison, 174
CUDA
Breakpoints, 175, 176
Controlling GPU threads, 177
CUDA Fortran, 178
DDT: CUDA, 179
Debugging multiple GPU processes, 180
Examining GPU threads and data, 181
GPU Debugging, 182
GPU device information, 183
IBM XLC/XLF with offloading OpenMP, 184
Launching, 185
Licensing, 186
Memory debugging, 187
NVIDIA, 188
Preparing to debug, 189
Running, 190
Running and pausing, 191
Selecting GPU threads, 192
Source code viewer, 193
Stepping, 194
Thread control, 195
Understanding kernel progress, 196
Viewing GPU thread locations, 197
CUDA profiling, 198
Current line, 199
Custom MPI scripts, 200
Cycles per instruction, 201, 202

Data
Changing, 203
Deadlock, 204
Debugging
Scalar, 205
Debugging symbols, 206
Detecting leaks, 207
Disassembler, 208
Disk read transfer, 209
Disk write transfer, 210
DP FLOPS, 211
Duration, 212
Dynamic linking
Cray X-Series, 213

Editing source code, 214, 215
End Session, 216
Energy metrics
Requirements, 217
Environment variables, 218, 219
Express Launch, 220, 221
Run dialog box, 222
Expression
Changing language, 223
External Editor, 224

Fencepost checking, 225
Files
Viewing multiple, 226
Find in Files, 227
Floating-point scalar instructions, 228
Floating-point vector instructions, 229
Focus
Breakpoints, 230
Changing, 231
Code viewer, 232
Parallel stack view, 233
Playing, 234
Process group viewer, 235
Step threads together, 236
Stepping, 237
Stepping threads window, 238
Focus control, 239
Font, 240, 241
Fortran intrinsics, 242
Fortran Modules, 243
Function Listing, 244
Functions view, 245

gdbserver, 246
Attaching, 247
General troubleshooting, 248
GNU compiler, 249
GNU UPC, 250
GNU/Linux systems, 251
Go To Line, 252
GPU, 253
Attaching, 254
Device information, 255
GPU Language support, 256
GPU kernels tab, 257
GPU memory usage, 258
GPU power usage, 259
GPU profiling, 260
GPU temperature, 261
GPU utilization, 262

Heap Overflow, 263
HP MPI, 264

I/O, 265
I/O time, 266
IBM XLC/XLF, 267
Inf, 268
Installation, 269
Mac OS X, 270
Linux, 271
Graphical, 272
Text-mode install, 273
Windows, 274
Instructions, 275
Intel Compiler, 276, 277
Intel compiler, 278
Intel Message Checker, 279
Intel MPI, 280
MPMD, 281
remote-exec, 282
Intel Xeon, 283
Intel Xeon RAPL, 284
Introduction, 285
Involuntary context switches, 286
IPMI, 287

Job ID regular expression, 288
Job submission, 289, 290
Cancelling, 291, 292
Custom, 293
Regular expression, 294, 295, 296
JSON, 297
Jump To Line
Double clicking, 298

Kernel-mode CPU time, 299
Known issues
Arm Forge times out, 300
MAP adds unexpected overhead, 301
MAP collects very deep stack traces with boost::coroutine, 302
MAP not correctly identifying vectorized instructions, 303
MAP over-reports MPI time, 304
MAP reporting time spent in function definition, 305
MAP specific issues, 306
MAP takes long time to analyze OpenBLAS app, 307
Arm (AArch64), 308
Attaching, 309
Cannot find executable, 310
Cannot find hosts, 311
Compiler, 312
Compiler inlining functions, 313
Controlling a program, 314
DDT stops responding, 315
Program jumps while stepping, 316
Cray, 317
Deadlock calling printf or malloc from a signal handler, 318
Evaluating variables, 319
C++ STL are not pretty printed, 320
Evaluating an array of derived types, 321
Incorrect values printed for Fortran array, 322
Variables cannot be viewed, 323
F1 Help, 324
General, 325
Input/Output, 326
Output to stderr not displayed, 327
Unwind errors, 328
Linking with static MAP sampler library fails, 329, 330
Memory debugging, 331
MPI, 332
MPI wrapper libraries, 333
mprotect fails, 334
No shared home directory, 335
Not enough samples, 336
Only main code visible, 337
Platform, 338
Programs run slowly, 339
Progress bar does not move, 340
Running processes do not show up in the attach window, 341
Source code, 342
No variables or line number information, 343
Source code does not appear, 344
Source code folding does not work, 345
Starting multi-process programs, 346
Starting scalar programs, 347
Starting the GUI, 348
System does not allow debuggers to connect to processes, 349, 350
Tail call optimization, 351
Thread support limitations, 352

L1 cache misses, 353
L2 cache accesses, 354
L2 cache misses, 355, 356
L2 data cache misses, 357
L3 cache misses, 358
Licensing
Architecture licensing, 359
Multiple architectures, 360
Floating licenses, 361
License files, 362
Single process license, 363
Single-process license, 364
Supercomputing and other floating licenses, 365
Workstation and evaluation licenses, 366
Linking, 367
Dynamic
On Cray X-Series using modules environment, 368
Static, 369
On Cray X-Series using modules environment, 370
Loadleveler, 371, 372
Local variables, 373
Log file, 374
Logbook
Arm DDT Logbook, 375
Annotation, 376
Comparison window, 377
Usage, 378
Lustre file opens, 379
Lustre metadata operations, 380
Lustre read transfer, 381
Lustre write transfer, 382

Mac OS X, 383
Macros, 384
Manual launch
ddt-client, 385
Debugging multi-process non-MPI programs, 386
Manual process selection, 387
map-link modules, 388
Installation
Cray X-Series, 389
Memory debugging, 390, 391
Available checks, 392
Changing settings at run time, 393
Configuration, 394, 395
Cray MPT, 396
Detecting leaks, 397
Enabling, 398
Library usage errors, 399
Memory Statistics, 400
mprotect fails, 401
Pointer error detection, 402
Static linking, 403
Suppressing an error, 404
Validity checking, 405
Writing beyond an allocated area, 406
Memory leak, 407
Memory leak report, 408
Memory usage, 409, 410
Message Queues, 411
Message queues, 412
Deadlock, 413
Interpreting, 414
Viewing, 415
Metrics, 416
Accelerator, 417, 418
CPU branch, 419
CPU branch mispredictions, 420
CPU cycles, 421
CPU floating-point, 422
CPU floating-point vector, 423
CPU FLOPS lower bound, 424
CPU FLOPS vector lower bound, 425
CPU instructions, 426
CPU integer, 427
CPU integer vector, 428
CPU memory access, 429
CPU Memory Accesses, 430
CPU power usage, 431
CPU time, 432
Cycles per instruction, 433
Detecting MPI imbalance, 434
Disk read transfer, 435
Disk write transfer, 436
Energy, 437
GPU memory usage, 438
GPU power usage, 439
GPU temperature, 440
GPU utilization, 441
I/O, 442
I/O time, 443
Instructions, 444
Involuntary context switches, 445
Kernel-mode CPU time, 446
L2 cache accesses, 447
L2 cache misses, 448
Lustre, 449
Lustre file opens, 450
Lustre metadata operations, 451
Lustre read transfer, 452
Lustre write transfer, 453
Memory, 454
Memory usage, 455
MPI, 456
MPI call duration, 457
MPI communication and waiting time, 458
MPI point-to-point and collective bytes, 459
MPI point-to-point and collective operations, 460
MPI sent and received, 461
Node memory usage, 462
Non-stalled cycles, 463
OpenMP
Multi-threaded computation time, 464
Multi-threaded MPI computation time, 465
Overhead, 466
Thread synchronization time, 467
Time inside an OpenMP region, 468
OpenMP Overhead, 469
Perf metrics, 470
POSIX I/O read rate, 471
POSIX I/O write rate, 472
POSIX read syscall rate, 473
POSIX write syscall rate, 474
Single-threaded computation time, 475
Stalled backend cycles, 476
Stalled cycles, 477
Stalled frontend cycles, 478
System load, 479
System power usage, 480
Time in global memory accesses, 481
User-mode CPU time, 482
Voluntary context switches, 483
Zooming, 484
Metrics view, 485
Mispredicted branch instructions, 486
Moab, 487, 488
MOM nodes, 489
MPC, 490
mpirun, 491
MPI, 492
Distributions, 493
Function Counters, 494
History/Logging, 495
MPI rank, 496
MPI Ranks, 497
mpirun, 498
Running, 499
Troubleshooting, 500
MPI call duration, 501
MPI communication and waiting time, 502
MPI job
Attaching to a subset, 503
Automatic detection, 504
MPI point-to-point and collective bytes, 505
MPI point-to-point and collective operations, 506
MPI sent and received, 507
MPI wrapper libraries, 508
MPI_Init
remote-exec, 509
MPICH, 510
p4, 511
p4 mpd, 512
MPICH 1
remote-exec, 513
MPICH 1 based MPI, 514
MPICH 2, 515
MPMD, 516
remote-exec, 517
MPICH 3, 518
MPMD, 519
remote-exec, 520
mpirun
remote-exec, 521
mpirun_rsh, 522
MPMD
Compatibility mode, 523
Intel MPI, 524
MPICH 2, 525
MPICH 3, 526
remote-exec, 527
Running, 528, 529
MPMD programs
Debugging, 530
Compatibility mode, 531
Without Express Launch, 532
Multi-dimensional array viewer (MDA), 533
Multi-threaded computation time, 534
Multi-threaded MPI computation time, 535
MVAPICH 2, 536

Navigating through source code history, 537
Node memory usage, 538
Non-stalled cycles, 539
Numactl
DDT, 540
MAP, 541
Number bases
Viewing, 542
nvcc, 543
Nvidia CUDA, 544
Known issues, 545
NVIDIA Tegra 2, 546

Obtaining Help, 547
Obtaining support, 548
Offline debugging, 549
HTML report, 550
Periodic snapshots, 551
Plain text report, 552
Reading a file for standard input, 553
Run-time job progress reporting, 554
Signal-triggered snapshots, 555
Using, 556
Writing a file from standard output, 557
Offloading OpenMP, 558
Open MPI, 559
MPMD, 560
Compatibility mode, 561
OpenACC, 562
OpenCL, 563
OpenGL, 564, 565
OpenMP, 566
Debugging, 567
Offloading, 568
OMP_NUM_THREADS, 569
Regions, 570
Running, 571, 572
OpenMP overhead, 573
OpenMP Regions view, 574
Oracle Grid Engine, 575, 576

PAPI, 577
Branch instructions, 578
Branch prediction, 579
Cache misses, 580
Completed instructions, 581, 582
Config file, 583
Cycles per instruction, 584
DP FLOPS, 585
Floating-point, 586
Floating-point scalar instructions, 587
Floating-point vector instructions, 588
Install, 589
L1 cache misses, 590
L2 cache misses, 591
L2 data cache misses, 592
L3 cache misses, 593
Metrics, 594
Mispredicted branch instructions, 595
Overview metrics, 596
Vector instructions, 597
Parallel Stack View, 598
Pathscale EKO compilers, 599
PBS, 600, 601
Pending breakpoints, 602
Perf metrics, 603
PGI Accelerators, 604
Platform MPI, 605
Plugins, 606
Enabling, 607
Installing, 608
Reference, 609
Supported, 610
Using, 611
Writing, 612
Pointer details, 613, 614
Pointer error detection, 615
Pointers, 616
Portland Group, 617
POSIX I/O read rate, 618
POSIX I/O write rate, 619
POSIX read syscall rate, 620
POSIX write syscall rate, 621
POWER8 and POWER9, 622
Pretty printers, 623
Process details, 624
Process Group Viewer, 625
Process groups, 626
Deleting, 627
Detailed view, 628
Summary view, 629
Processes and cores view, 630
PROCS_PER_NODE_TAG, 631
Profiling, 632, 633
Preparing a program, 634
Program part, 635
Programming errors, 636
Python
Running, 637
Python Profiling, 638

Queue submission, 639
Cancelling, 640
Queue submission via Express Launch, 641
Queue template syntax, 642
Environment variables
PROCS_PER_NODE_TAG, 643
Queue template tags, 644
Defining new tags, 645
Environment variable
AUTO_LAUNCH_TAG, 646
Launching, 647
Specifying default options, 648
Using ddt-mpirun, 649

Raw command, 650
Raw Command Window, 651
Rebuilding applications, 652, 653
Receive queue, 654
Registers
Viewing, 655
Remote Client
Installation
Mac OS X, 656
Windows, 657
Remote client, 658
Configuration, 659
Multiple hops, 660
Remote launch, 661
Remote script, 662
Using X forwarding or VNC, 663
remote-exec
Required, 664
Requirements
Energy metrics, 665
Restarting, 666
Reverse Connect, 667
Run-time
Job progress reporting, 668
Running
MPMD, 669, 670
Scalar, 671
Running a program, 672
Running programs
Attaching, 673
Manual process selection, 674

Saving output, 675
Scalar
Debugging, 676
Running, 677
Scalar programs, 678
Search, 679, 680
Selected Lines View, 681
Send queue, 682
Send signal, 683
Sending signals, 684
Session
Saving, 685
Session menu, 686
SGI, 687
SGI MPT
remote-exec, 688
Shared arrays, 689
Signal Handling
Divisions by zero, 690
Floating Point Exception, 691
Segmentation fault, 692
SIGFPE, 693
SIGILL, 694
SIGPIPE, 695
SIGSEGV, 696
SIGUSR1, 697
SIGUSR2, 698
Signal handling, 699
Custom, 700
Sending signals, 701
Single stepping, 702
Single-threaded computation time, 703
SLURM, 704, 705, 706
Slurm
Starting scalar programs, 707
SMP
Performance, 708
Source Code, 709
Source code, 710, 711
Application and external code split, 712
Committing, 713, 714
Editing, 715, 716
Find in Files, 717
Missing files, 718
Project files, 719
Rebuilding, 720, 721
Searching, 722, 723
Viewing, 724, 725
Sparkline, 726
Sparklines, 727
Spectrum MPI, 728
Stack frame, 729
Stacks table, 730
Stacks view, 731
Stalled backend cycles, 732
Stalled cycles, 733
Stalled frontend cycles, 734
Standard error, 735
Standard input, 736, 737
Standard output, 738
Starting, 739
Starting MAP, 740
Static analysis, 741
Static checking, 742
Static linking, 743
On Cray X-Series, 744
Step threads together, 745
Stop messages, 746
Stopping, 747
Supported platforms, 748
Arm DDT, 749
Arm MAP, 750
Batch schedulers, 751
Suspending breakpoints, 752
Synchronizing processes, 753
System load, 754
System power usage, 755

Tab size, 756
Tail call optimization, 757
Thread synchronization time, 758
Time in global memory accesses, 759
Time inside an OpenMP region, 760
Time spent on selected lines, 761
TORQUE, 762, 763
Tracepoints, 764
Setting, 765
Tracepoint output, 766

Unexpected queue, 767
Unified Parallel C, 768, 769
Unwind errors, 770
UPC, 771
Berkeley, 772
GNU, 773
User-mode CPU time, 774
Using custom MPI scripts, 775

Validity checking, 776
Variables, 777
Searching, 778, 779
Unused variables, 780
Vector instructions, 781
Version control
Breakpoints and tracepoints, 782
Version control information, 783
Viewing multiple files, 784
Viewing stacks, 785
Overview, 786
Parallel Stack View, 787
Viewing stacks in parallel, 788
Visualize Whitespace, 789
VNC, 790, 791
Voluntary context switches, 792

Warning Symbols, 793
Watchpoints, 794
Welcome Page, 795
Welcome Screen, 796

X forwarding, 797
X11, 798
XK6, 799

Zooming, 800
