Intel Xeon processors from the Sandy Bridge generation onward include Running Average Power Limit (RAPL) counters. Performance Reports can use the RAPL counters to report the energy and power consumption of your programs.
To allow Performance Reports to read the RAPL counters, you must load the intel_rapl kernel module.
The intel_rapl module is included in Linux kernel releases 3.13 and later.
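As an illustration of how an energy counter translates into a power figure, the following minimal sketch converts two cumulative energy readings into average power. It assumes the standard Linux powercap sysfs layout (/sys/class/powercap/intel-rapl:0 with the files energy_uj and max_energy_range_uj); the function names are hypothetical and this is not how Performance Reports itself reads the counters. Reading energy_uj may require root privileges on some kernels.

```python
from pathlib import Path

# Assumption: package 0 is exposed here by the Linux powercap framework.
RAPL_ZONE = Path("/sys/class/powercap/intel-rapl:0")


def read_counter(name, zone=RAPL_ZONE):
    """Return the integer stored in a powercap sysfs file, or None if unreadable."""
    try:
        return int((zone / name).read_text())
    except (OSError, ValueError):
        return None


def average_power_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power between two energy_uj readings, allowing for one counter wrap."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta_uj = end_uj - start_uj
    if delta_uj < 0:
        # The cumulative counter wrapped around after reaching max_energy_range_uj.
        delta_uj += max_range_uj
    # Microjoules to joules, then joules per second = watts.
    return delta_uj / 1e6 / seconds
```

If read_counter("energy_uj") returns None, the module is probably not loaded or the file is not readable by the current user. Otherwise, take two readings a known number of seconds apart and pass them, together with the value of max_energy_range_uj, to average_power_watts.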
For testing purposes, Arm has backported the powercap and intel_rapl modules to older kernel releases. You may download the backported modules from:
- CUDA metrics are not available for statically-linked programs.
- CUDA metrics are measured at the node level, not the card level.
You should be aware of the following issues:
- Performance Reports does not support CPU time metrics on this platform. Linux perf event metrics are available instead. To ensure that access to performance counters is not restricted, run sysctl -w kernel.perf_event_paranoid=0 as root.
- Performance Reports may fail to finalize a profiling session if the cores are oversubscribed on AArch64 platforms. For example, this issue is likely to occur when profiling a 64-process MPI program on a machine with only 8 cores. The issue appears as a hang after the profile has finished.
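Both issues above can be checked before starting a profile. The following minimal sketch reads the current perf_event_paranoid setting from /proc and compares a planned process count against the number of available cores. The function names are hypothetical, and the sketch treats "more processes than cores" as oversubscription, which is an assumption about where the problem begins rather than a documented threshold.

```python
import os
from pathlib import Path

# Standard procfs location of the setting changed by
# sysctl -w kernel.perf_event_paranoid=0.
PARANOID_FILE = Path("/proc/sys/kernel/perf_event_paranoid")


def perf_event_paranoid(path=PARANOID_FILE):
    """Return the current perf_event_paranoid level, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def is_oversubscribed(num_processes, num_cores=None):
    """Return True if more processes are requested than cores are available."""
    if num_cores is None:
        # os.cpu_count() can return None; fall back to a single core.
        num_cores = os.cpu_count() or 1
    return num_processes > num_cores
```

For the example in the list above, is_oversubscribed(64, 8) returns True, so that job would be at risk of hanging at the end of the profile. A perf_event_paranoid() result greater than 0 suggests that access to performance counters may still be restricted.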