Index

Arm IPMI Energy Agent
Requirements
MAP file
Performance Reports
Specific issues

Accelerator breakdown
Global memory accesses
GPU utilization
Mean GPU memory usage
Peak GPU memory usage
AMD
OpenCL

Bull MPI

Compatibility Launch
Compute node access
Configuration
CPU breakdown
Memory accesses
OpenMP code
Scalar numeric ops
Single core code
Vector numeric ops
Waiting for accelerators
Cray Compiler Environment
Cray MPT
Cray Native SLURM
Cray X
Cray X-Series
CSV performance reports
Custom DCIM
Custom gmetric

DCIM output
Dynamic linking
Cray X-Series

Enable and disable metrics
Energy breakdown
CPU
Mean node power
Peak node power
System
Energy metrics
Requirements
Example
Compiling
Cray
Generating a performance report
Overview
Running
Express Launch
Compatible MPIs

General Troubleshooting
Generating a report
Getting Support
GNU Compiler

HTML reports

I/O breakdown
Effective process read rate
Effective process write rate
Time in reads
Time in writes
Installation
Linux
Graphical install
Text-mode install
Intel Compiler
Intel MPI
Intel Xeon
RAPL
Interpreting
Introduction
IPMI

Known issues
Performance Reports
Compiler inlining functions
Incorrect MPI time
Insufficient samples
MPI wrapper libraries
No thread activity while blocking on an MPI call
Not correctly identifying vectorized instructions
OpenBLAS application
Reporting time spent in a function definition
Tail Call
Thread support limitations
Compiler
General
No shared home directory
Problems starting multi-process programs
Starting a program
Starting scalar programs

Licensing
Architecture licensing
License files
Supercomputing and other floating licenses
Using multiple architecture licenses
Workstation and evaluation licenses
Linking
Dynamic
On Cray X-Series using modules environment
Static
On Cray X-Series using modules environment
Log file

Map-link modules
Installation
Cray X-Series
Memory breakdown
Mean process memory usage
Peak node memory usage
Peak process memory usage
Metrics
Accelerator breakdown
Computation
Compute
CPU breakdown
Effective process collective rate
Effective process point-to-point rate
Effective process read rate
Effective process write rate
Energy
Accelerator
CPU
Mean node power
Peak node power
System
Energy breakdown
Global memory accesses
GPU Utilization
I/O breakdown
Input/Output
Mean GPU memory usage
Mean process memory usage
Memory accesses
Memory breakdown
MPI
MPI breakdown
OpenMP breakdown
OpenMP code
Peak GPU memory usage
Peak node memory usage
Peak process memory usage
Physical core utilization
Scalar numeric ops
Single core code
Synchronization
System load
Threads breakdown
Time in collective calls
Time in point-to-point calls
Time in reads
Time in writes
Vector numeric ops
Waiting for accelerators
MPI
Troubleshooting
MPI breakdown
Effective process collective rate
Effective process point-to-point rate
Time in collective calls
Time in point-to-point calls
MPI wrapper libraries
MPICH 2
MPICH 3

NVIDIA CUDA

Obtaining support
Online resources
Open MPI
OpenMP breakdown
Computation
Physical core utilization
Synchronization
System load
Output locations

Performance reports
Energy breakdown
Accelerator
Threads breakdown
Synchronization
Platform MPI
Portland Group Compiler
Profiling
Preparing a program

Report summary
Compute
Input/Output
MPI
Requirements
Energy metrics
Running

SGI
SLURM
Spectrum MPI
Static linking
On Cray X-Series
Supported Platforms

Textual performance reports
Thread support limitations
Threads breakdown
Computation
Physical core utilization
System load

Unified Parallel C
UPC
Berkeley
GNU

Worked examples
Code characterization and run size comparison
Deeper CPU metric analysis
I/O performance bottlenecks
