Index

Arm DDT
Controlling program execution, 1
Getting started, 2
Getting Support, 3
Installation, 4
Introduction, 5
Logbook, 6
Overview, 7
Program input and output, 8
Running a program, 9
Starting a program
From a job script, 10
Starting, stopping and restarting, 11
Supported platforms, 12
Arm MAP
Cray MPT, 13
Custom metrics, 14
Displaying selected processes, 15
Environment variables, 16
Functions view, 17
Getting started, 18
Getting Support, 19
Installation, 20
Introduction, 21
JSON, 22
Activities, 23
Categories, 24
Example, 25
Metrics, 26
Metrics view, 27
Program output, 28
Project files view, 29
Restricting output, 30
Running from the command line, 31
Saving output, 32
Standard error, 33
Standard output, 34
Starting from job script, 35
Supported platforms, 36
View modes, 37
Main thread only, 38
OpenMP mode, 39
Pthread mode, 40
Viewing totals, 41
Arm IPMI Energy Agent, 42
Requirements, 43
Mac OS X, 44

Accelerator, 45
Align stacks, 46
AMD
OpenCL, 47
Apple
Mac OS X, 48
Application, 49, 50
Arbitrary expressions and global variables, 51
Architecture licensing, 52
Multiple architectures, 53
Arm, 54, 55
Arm (AArch64), 56
Array
Distributed, 57
Expression, 58
Filtering, 59
Multi-dimensional
Viewing, 60
Array data
Viewing, 61
Arrays
Auto Update, 62
Comparing elements across processes, 63
Export, 64
Layout
Data table, 65
Multi-dimensional, 66
Statistics, 67
Visualization, 68
Attaching, 69, 70, 71
Choose hosts, 72
Command line, 73
Configuring
Remote hosts, 74
Hosts file, 75
Attaching to running programs, 76
AUTO_LAUNCH_TAG, 77

Backtrace, 78
Batch schedulers, 79
Berkeley UPC, 80
Bounds checking, 81
Branch instructions, 82
Branch mispredicts, 83, 84
Breakpoints, 85
Conditional, 86
CUDA, 87
Default, 88
Deleting, 89
Focus, 90
Loading, 91
Saving, 92
Setting, 93
Pending, 94
Using source code viewer, 95
Using the Add Breakpoint window, 96
Buffer overflow, 97
Building applications, 98, 99
Bull MPI, 100

C++ STL, 101
C++ STL support, 102
Caliper, 103
Cobalt, 104, 105
Colour Scheme, 106
Compilers
AMD, 107
Cray, 108
GNU, 109
IBM XLC/XLF, 110
Intel, 111
Known issues, 112
OpenCL, 113
Pathscale EKO compilers, 114
Portland Group, 115
Completed instructions, 116, 117
Complex numbers, 118
Configuration, 119, 120
Appearance, 121
Code viewer, 122
Configuration files, 123
Connecting to remote programs, 124
Converting legacy sitewide configuration files, 125
Importing legacy, 126
Job size, 127
Job submission, 128
Optional, 129
Queue commands, 130
Queuing systems, 131
Quick restart, 132
Sitewide, 133
Startup scripts, 134
System, 135
Template script, 136
Template tutorial, 137
Using a shared installation on multiple systems, 138
Using shared home directories on multiple systems, 139
Connecting to a remote system, 140
Consistency checking
Heap, 141
Core Files, 142
Core files, 143
CPU branch, 144
CPU branch mispredictions, 145
CPU floating-point, 146
CPU floating-point vector, 147
CPU FLOPS lower bound, 148
CPU FLOPS vector lower bound, 149
CPU instructions, 150
CPU integer, 151
CPU integer vector, 152
CPU memory access, 153
CPU Memory Accesses, 154
CPU power usage, 155
CPU time, 156
Cray, 157, 158
Compiling scalar programs, 159
Starting scalar programs, 160
Cray ATP, 161
Cray compiler environment, 162
Cray MPT, 163
Cray Native SLURM, 164, 165
Cray X, 166
Cray X-Series, 167, 168, 169, 170, 171, 172
Cray XK6, 173
Cross-process comparison, 174, 175
Cross-thread comparison, 176
CUDA
Breakpoints, 177, 178
Controlling GPU threads, 179
CUDA Fortran, 180
DDT: CUDA, 181
Debugging multiple GPU processes, 182
Examining GPU threads and data, 183
GPU Debugging, 184
GPU device information, 185
IBM XLC/XLF with offloading OpenMP, 186
Launching, 187
Licensing, 188
Memory debugging, 189
NVIDIA, 190
Preparing to debug, 191
Running, 192
Running and pausing, 193
Selecting GPU threads, 194
Source code viewer, 195
Stepping, 196
Thread control, 197
Understanding kernel progress, 198
Viewing GPU thread locations, 199
CUDA profiling, 200
Current line, 201
Custom MPI scripts, 202
Cycles per instruction, 203, 204, 205
Cycles per instruction (Armv8-A), 206

Data
Changing, 207
Deadlock, 208
Debugging
Scalar, 209
Debugging symbols, 210
Detecting leaks, 211
Disassembler, 212
Disk read transfer, 213
Disk write transfer, 214
DP FLOPS, 215
Duration, 216
Dynamic linking
Cray X-Series, 217

Editing source code, 218, 219
End Session, 220
Energy metrics
Requirements, 221
Environment variables, 222, 223
Express Launch, 224, 225
Run dialog box, 226
Expression
Changing language, 227
External Editor, 228

Fencepost checking, 229
Files
Viewing multiple, 230
Find in Files, 231
Floating-point scalar instructions, 232
Floating-point vector instructions, 233
Focus
Breakpoints, 234
Changing, 235
Code viewer, 236
Parallel stack view, 237
Playing, 238
Process group viewer, 239
Step threads together, 240
Stepping, 241
Stepping threads window, 242
Focus control, 243
Font, 244, 245
Fortran intrinsics, 246
Fortran Modules, 247
Function Listing, 248
Functions view, 249

gdbserver, 250
Attaching, 251
General troubleshooting, 252
GNU compiler, 253
GNU UPC, 254
GNU/Linux systems, 255
Go To Line, 256
GPU, 257
Attaching, 258
Device information, 259
GPU Language support, 260
GPU kernels tab, 261
GPU memory usage, 262
GPU power usage, 263
GPU profiling, 264
GPU temperature, 265
GPU utilization, 266

Heap Overflow, 267
HP MPI, 268

I/O, 269
I/O time, 270
IBM XLC/XLF, 271
Inf, 272
Installation, 273
Mac OS X, 274
Linux
Graphical, 275
Text-mode install, 276
Windows, 277
Intel Compiler, 278, 279
Intel compiler, 280
Intel Message Checker, 281
Intel MPI, 282
MPMD, 283
remote-exec, 284
Intel Xeon, 285
Intel Xeon RAPL, 286
Introduction, 287
Involuntary context switches, 288
IPMI, 289

Job ID regular expression, 290
Job submission, 291, 292
Cancelling, 293, 294
Custom, 295
Regular expression, 296, 297, 298
JSON, 299
Jump To Line
Double clicking, 300

Kernel-mode CPU time, 301
Known issues
Arm Forge times out, 302
MAP adds unexpected overhead, 303
MAP collects very deep stack traces with boost::coroutine, 304
MAP not correctly identifying vectorized instructions, 305
MAP over-reports MPI time, 306
MAP reporting time spent in function definition, 307
MAP specific issues, 308
MAP takes long time to analyze OpenBLAS app, 309
Arm (AArch64), 310
Attaching, 311
Cannot find executable, 312
Cannot find hosts, 313
Compiler, 314
Compiler inlining functions, 315
Controlling a program, 316
DDT stops responding, 317
Program jumps while stepping, 318
Cray, 319
Deadlock calling printf or malloc from a signal handler, 320
Evaluating variables, 321
C++ STL are not pretty printed, 322
Evaluating an array of derived types, 323
Incorrect values printed for Fortran array, 324
Variables cannot be viewed, 325
F1 Help, 326
General, 327
Input/Output, 328
Output to stderr not displayed, 329
Unwind errors, 330
Linking with static MAP sampler library fails, 331, 332
Memory debugging, 333
MPI, 334
MPI wrapper libraries, 335
mprotect fails, 336
No shared home directory, 337
Not enough samples, 338
Only main code visible, 339
Platform, 340
Programs run slowly, 341
Progress bar does not move, 342
Running processes do not show up in the attach window, 343
Source code, 344
No variables or line number information, 345
Source code does not appear, 346
Source code folding does not work, 347
Starting multi-process programs, 348
Starting scalar programs, 349
Starting the GUI, 350
System does not allow debuggers to connect to processes, 351, 352
Tail call optimization, 353
Thread support limitations, 354

L1 cache misses, 355
L2 cache misses, 356
L2 Data cache miss, 357
L2 data cache misses, 358
L3 cache miss per instruction, 359
L3 cache misses, 360
Licensing
Architecture licensing, 361
Multiple architectures, 362
Floating licenses, 363
License files, 364
Single process license, 365
Single-process license, 366
Supercomputing and other floating licenses, 367
Workstation and evaluation licenses, 368
Linking, 369
Dynamic
On Cray X-Series using modules environment, 370
Static, 371
On Cray X-Series using modules environment, 372
Loadleveler, 373, 374
Local variables, 375
Log file, 376
Logbook
Arm DDT Logbook, 377
Annotation, 378
Comparison window, 379
Usage, 380
Lustre file opens, 381
Lustre metadata operations, 382
Lustre read transfer, 383
Lustre write transfer, 384

Mac OS X, 385
Macros, 386
Manual launch
ddt-client, 387
Debugging multi-process non-MPI programs, 388
Manual process selection, 389
map-link modules, 390
Installation
Cray X-Series, 391
Memory debugging, 392, 393
Available checks, 394
Changing settings at run time, 395
Configuration, 396, 397
Cray MPT, 398
Detecting leaks, 399
Enabling, 400
Library usage errors, 401
Memory Statistics, 402
mprotect fails, 403
PMDK, 404
Pointer error detection, 405
Static linking, 406
Suppressing an error, 407
Validity checking, 408
Writing beyond an allocated area, 409
Memory leak, 410
Memory leak report, 411
Memory usage, 412, 413
Message Queues, 414
Message queues, 415
Deadlock, 416
Interpreting, 417
Viewing, 418
Metrics, 419
Accelerator, 420, 421
Branch mispredicts (Armv8-A), 422
Branch mispredicts (Power 9), 423
CPU branch, 424
CPU branch mispredictions, 425
CPU floating-point, 426
CPU floating-point vector, 427
CPU FLOPS lower bound, 428
CPU FLOPS vector lower bound, 429
CPU instructions, 430
CPU integer, 431
CPU integer vector, 432
CPU memory access, 433
CPU Memory Accesses, 434
CPU power usage, 435
CPU time, 436
Cycles per instruction, 437, 438
Cycles per instruction (Armv8-A), 439
Detecting MPI imbalance, 440
Disk read transfer, 441
Disk write transfer, 442
Energy, 443
GPU memory usage, 444
GPU power usage, 445
GPU temperature, 446
GPU utilization, 447
I/O, 448
I/O time, 449
Involuntary context switches, 450
Kernel-mode CPU time, 451
L2 Data cache miss, 452
L3 cache miss per instruction, 453
Lustre, 454
Lustre file opens, 455
Lustre metadata operations, 456
Lustre read transfer, 457
Lustre write transfer, 458
Memory, 459
Memory usage, 460
MPI, 461
MPI call duration, 462
MPI communication and waiting time, 463
MPI point-to-point and collective bytes, 464
MPI point-to-point and collective operations, 465
MPI sent and received, 466
Node memory usage, 467
OpenMP
Multi-threaded computation time, 468
Multi-threaded MPI computation time, 469
Overhead, 470
Thread synchronization time, 471
Time inside an OpenMP region, 472
OpenMP Overhead, 473
POSIX I/O read rate, 474
POSIX I/O write rate, 475
POSIX read syscall rate, 476
POSIX write syscall rate, 477
Single-threaded computation time, 478
Stalled backend cycles, 479, 480
Stalled frontend cycles, 481
System load, 482
System power usage, 483
Time in global memory accesses, 484
User-mode CPU time, 485
Voluntary context switches, 486
Zooming, 487
Metrics view, 488
Mispredicted branch instructions, 489
Moab, 490, 491
MOM nodes, 492
MPC, 493
mpirun, 494
MPI, 495
Distributions, 496
Function Counters, 497
History/Logging, 498
MPI rank, 499
MPI Ranks, 500
mpirun, 501
Running, 502
Troubleshooting, 503
MPI call duration, 504
MPI communication and waiting time, 505
MPI job
Attaching to a subset, 506
Automatic detection, 507
MPI point-to-point and collective bytes, 508
MPI point-to-point and collective operations, 509
MPI sent and received, 510
MPI wrapper libraries, 511
MPI_Init
remote-exec, 512
MPICH, 513
p4, 514
p4 mpd, 515
MPICH 1
remote-exec, 516
MPICH 1 based MPI, 517
MPICH 2, 518
MPMD, 519
remote-exec, 520
MPICH 3, 521
MPMD, 522
remote-exec, 523
mpirun
remote-exec, 524
mpirun_rsh, 525
MPMD
Compatibility mode, 526
Intel MPI, 527
MPICH 2, 528
MPICH 3, 529
remote-exec, 530
Running, 531, 532
MPMD programs
Debugging, 533
Compatibility mode, 534
Without Express Launch, 535
Multi-dimensional array viewer (MDA), 536
Multi-threaded computation time, 537
Multi-threaded MPI computation time, 538
MVAPICH 2, 539

Navigating through source code history, 540
Node memory usage, 541
Numactl
DDT, 542
MAP, 543
Number bases
Viewing, 544
nvcc, 545
Nvidia CUDA, 546
Known issues, 547
NVIDIA Tegra 2, 548

Obtaining Help, 549
Obtaining support, 550
Offline debugging, 551
HTML report, 552
Periodic snapshots, 553
Plain text report, 554
Reading a file for standard input, 555
Run-time job progress reporting, 556
Signal-triggered snapshots, 557
Using, 558
Writing a file from standard output, 559
Offloading OpenMP, 560
Open MPI, 561
MPMD, 562
Compatibility mode, 563
OpenACC, 564
OpenCL, 565
OpenGL, 566, 567
OpenMP, 568
Debugging, 569
Offloading, 570
OMP_NUM_THREADS, 571
Regions, 572
Running, 573, 574
OpenMP overhead, 575
OpenMP Regions view, 576
Oracle Grid Engine, 577, 578

PAPI, 579
Branch instructions, 580
Branch prediction, 581
Cache misses, 582
Completed instructions, 583, 584
Config file, 585
Cycles per instruction, 586
DP FLOPS, 587
Floating-point, 588
Floating-point scalar instructions, 589
Floating-point vector instructions, 590
Install, 591
L1 cache misses, 592
L2 cache misses, 593
L2 data cache misses, 594
L3 cache misses, 595
Metrics, 596
Mispredicted branch instructions, 597
Overview metrics, 598
Vector instructions, 599
Parallel Stack View, 600
Pathscale EKO compilers, 601
PBS, 602, 603
Pending breakpoints, 604
PGI Accelerators, 605
Platform MPI, 606
Plugins, 607
Enabling, 608
Installing, 609
Reference, 610
Supported, 611
Using, 612
Writing, 613
PMDK, 614
Pointer details, 615, 616
Pointer error detection, 617
Pointers, 618
Portland Group, 619
POSIX I/O read rate, 620
POSIX I/O write rate, 621
POSIX read syscall rate, 622
POSIX write syscall rate, 623
POWER8 and POWER9, 624
Pretty printers, 625
Process details, 626
Process Group Viewer, 627
Process groups, 628
Deleting, 629
Detailed view, 630
Summary view, 631
Processes and cores view, 632
PROCS_PER_NODE_TAG, 633
Profile a Python script, 634
Profiling, 635, 636
Preparing a program, 637
Program part, 638
Programming errors, 639
Python
Running, 640
Python Profiling, 641
Python profiling known issues, 642

Queue submission, 643
Cancelling, 644
Queue submission via Express Launch, 645
Queue template syntax, 646
Environment variables
PROCS_PER_NODE_TAG, 647
Queue template tags, 648
Defining new tags, 649
Environment variable
AUTO_LAUNCH_TAG, 650
Launching, 651
Specifying default options, 652
Using ddt-mpirun, 653

Raw command, 654
Raw Command Window, 655
Rebuilding applications, 656, 657
Receive queue, 658
Registers
Viewing, 659
Remote Client
Installation
Mac OS X, 660
Windows, 661
Remote client, 662
Configuration, 663
Multiple hops, 664
Remote launch, 665
Remote script, 666
Using X forwarding or VNC, 667
remote-exec
Required, 668
Requirements
Energy metrics, 669
Restarting, 670
Reverse Connect, 671
Run-time
Job progress reporting, 672
Running
MPMD, 673, 674
Scalar, 675
Running a program, 676
Running programs
Attaching, 677
Manual process selection, 678

Saving output, 679
Scalar
Debugging, 680
Running, 681
Scalar programs, 682
Search, 683, 684
Selected Lines View, 685
Send queue, 686
Send signal, 687
Sending signals, 688
Session
Saving, 689
Session menu, 690
SGI, 691
SGI MPT
remote-exec, 692
Shared arrays, 693
Signal Handling
Divisions by zero, 694
Floating Point Exception, 695
Segmentation fault, 696
SIGFPE, 697
SIGILL, 698
SIGPIPE, 699
SIGSEGV, 700
SIGUSR1, 701
SIGUSR2, 702
Signal handling, 703
Custom, 704
Sending signals, 705
Single stepping, 706
Single-threaded computation time, 707
SLURM, 708, 709, 710
Slurm
Starting scalar programs, 711
SMP
Performance, 712
Source Code, 713
Source code, 714, 715
Application and external code split, 716
Committing, 717, 718
Editing, 719, 720
Find in Files, 721
Missing files, 722
Project files, 723
Rebuilding, 724, 725
Searching, 726, 727
Viewing, 728, 729
Sparkline, 730
Sparklines, 731
Spectrum MPI, 732
Spindle, 733
Stack frame, 734
Stacks table, 735
Stacks view, 736
Stalled backend cycles, 737, 738
Stalled frontend cycles, 739
Standard error, 740
Standard input, 741, 742
Standard output, 743
Starting, 744
Starting MAP, 745
Static analysis, 746
Static checking, 747
Static linking, 748
On Cray X-Series, 749
Step threads together, 750
Stop messages, 751
Stopping, 752
Supported platforms, 753
Arm DDT, 754
Arm MAP, 755
Batch schedulers, 756
Suspending breakpoints, 757
Synchronizing processes, 758
System load, 759
System power usage, 760

Tab size, 761
Tail call optimization, 762
Thread synchronization time, 763
Time in global memory accesses, 764
Time inside an OpenMP region, 765
Time spent on selected lines, 766
TORQUE, 767, 768
Tracepoints, 769
Setting, 770
Tracepoint output, 771

Unexpected queue, 772
Unified Parallel C, 773, 774
Unwind errors, 775
UPC, 776
Berkeley, 777
GNU, 778
User-mode CPU time, 779
Using custom MPI scripts, 780

Validity checking, 781
Variables, 782
Searching, 783, 784
Unused variables, 785
Vector instructions, 786
Version control
Breakpoints and tracepoints, 787
Version control information, 788
Viewing multiple files, 789
Viewing stacks, 790
Overview, 791
Parallel Stack View, 792
Viewing stacks in parallel, 793
Visualize Whitespace, 794
VNC, 795, 796
Voluntary context switches, 797

Warning Symbols, 798
Watchpoints, 799
Welcome Page, 800
Welcome Screen, 801

X forwarding, 802
X11, 803
XK6, 804

Zooming, 805
