Note: Arm Forge Professional is required to make use of this feature. Please contact Arm Sales at HPCToolsSales@arm.com for details on how to upgrade.
The PAPI metrics are additional metrics available for MAP which use the Performance Application Programming Interface (PAPI). They can be used on any system supported by PAPI.
Note: In this release, PAPI metrics are collected from the main thread only.
Due to the limitations of PAPI, some metrics may be unavailable on your system. MAP displays all available metrics, and displays error messages where metrics are not available.
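Before profiling, it can help to check which PAPI preset events your system exposes. The following is a minimal sketch, not part of Arm Forge: it assumes the papi_avail utility that ships with PAPI is on your PATH, and that papi_avail -a prints one available preset event per line, starting with the event name. The helper name event_listed is hypothetical, and the events checked are a sample of those named in this section.

```shell
# Succeeds if the event named in $1 appears at the start of a line on stdin.
# Intended for the output of `papi_avail -a`, which is assumed to print one
# available preset event per line, starting with the event name.
event_listed() {
    grep -q "^$1[[:space:]]"
}

# Query PAPI only if its papi_avail utility is installed and on PATH.
if command -v papi_avail >/dev/null 2>&1; then
    available=$(papi_avail -a)
    for event in PAPI_TOT_INS PAPI_DP_OPS PAPI_L2_DCM PAPI_L1_TCM; do
        if printf '%s\n' "$available" | event_listed "$event"; then
            echo "$event: available"
        else
            echo "$event: not available"
        fi
    done
fi
```

An event reported as not available here is likely to be one for which MAP shows an error message instead of a metric.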
As there is a limit on the type and number of events that can be counted together, PAPI metrics have been split up into small groups of compatible events, so that the user can choose which events to view.
To use these metrics, download and install PAPI from http://icl.cs.utk.edu/papi/index.html. Then run the metrics installer papi_install.sh from the Arm Forge directory.
Once installation has completed, edit the PAPI.config file to set your configuration as required.
By default, a template PAPI.config file is provided in your installation directory at /arm_installation_directory/map/metrics. Alternatively, the PAPI.config file can be located inside your configuration directory, as set by the ALLINEA_CONFIG_DIR environment variable. By default, your configuration directory is $HOME/.allinea.
To use a PAPI.config file located elsewhere, set and export the ALLINEA_PAPI_CONFIG environment variable to point to your PAPI.config file. This must be done before running MAP.
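A minimal sketch of the export, assuming a Bourne-compatible shell. The file location $HOME/papi/PAPI.config is hypothetical and used for illustration only; substitute the actual path to your file.

```shell
# Point MAP at a PAPI.config file outside the default locations.
# The path below is a hypothetical example, not a default.
export ALLINEA_PAPI_CONFIG="$HOME/papi/PAPI.config"

# Confirm the value that MAP will inherit from this shell.
echo "ALLINEA_PAPI_CONFIG=$ALLINEA_PAPI_CONFIG"
```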
If you are using a queuing system, be sure that the ALLINEA_PAPI_CONFIG variable is set and exported to all the compute nodes, by adding the ALLINEA_PAPI_CONFIG export line to the job script before the MAP command line.
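As an illustration, a job script might place the export line as sketched below. This is an assumption-laden outline rather than a template for any particular scheduler: the scheduler directives are omitted because they depend on your queuing system, and the PAPI.config path, the mpirun launcher, the process count, and the program name ./my_app are all hypothetical.

```shell
#!/bin/bash
# Scheduler directives for your queuing system would go here (omitted).

# Export the variable before the MAP command line so that it is part of
# the environment MAP runs in. The path is a hypothetical example.
export ALLINEA_PAPI_CONFIG="$HOME/papi/PAPI.config"

# Profile the program non-interactively; ./my_app and the mpirun arguments
# are placeholders. The guard only keeps this sketch harmless on machines
# where MAP is not installed.
if command -v map >/dev/null 2>&1; then
    map --profile mpirun -n 4 ./my_app
else
    echo "map not found on PATH" >&2
fi
```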
The PAPI.config file contains all the metric sets that can be used. Its location is shown at the end of the installation process. The default metric set is Overview. To use a different PAPI metric set, change the value of the variable named set to one of CacheMisses, BranchPrediction, or FloatingPoint.
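Because a mistyped set name is easy to miss, a small check such as the following can be run before editing the file. It is a sketch only: the function name is_valid_papi_set and the variable chosen are hypothetical, the comparison is assumed to be case-sensitive, and the script does not read or modify PAPI.config itself.

```shell
# Succeeds only for the metric set names listed in this section.
# Case-sensitive matching is an assumption of this sketch.
is_valid_papi_set() {
    case "$1" in
        Overview|CacheMisses|BranchPrediction|FloatingPoint) return 0 ;;
        *) return 1 ;;
    esac
}

chosen="CacheMisses"   # hypothetical choice
if is_valid_papi_set "$chosen"; then
    echo "OK: $chosen is a known metric set"
else
    echo "Unknown metric set: $chosen" >&2
fi
```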
DP FLOPS: The number of double precision floating-point operations performed per second. This uses the PAPI_DP_OPS (double precision floating-point operations) event. What it actually counts differs across architectures. Additionally, there are many caveats surrounding this PAPI preset on Intel architectures. See http://icl.cs.utk.edu/projects/papi/wiki/PAPITopics:SandyFlops for more details.
L2 data cache misses: The number of L2 data cache misses per second. This uses the PAPI_L2_DCM (L2 data cache misses) event. This metric is only available in this preset if the system has enough hardware counters (at least 5) to collect the required events.
L1 cache misses: The number of L1 cache misses per second. This uses the PAPI_L1_TCM (L1 total cache misses) event, although if this event is unavailable the L1 data cache misses metric (using the PAPI_L1_DCM event) will be displayed instead.
L2 cache misses: The number of L2 cache misses per second. This uses the PAPI_L2_TCM (L2 total cache misses) event, although if this event is unavailable the L2 data cache misses metric (using the PAPI_L2_DCM event) will be displayed instead.
L3 cache misses: The number of L3 cache misses per second. This uses the PAPI_L3_TCM (L3 total cache misses) event, although if this event is unavailable the L3 data cache misses metric (using the PAPI_L3_DCM event) will be displayed instead.
Floating-point vector instructions: The number of vector floating-point instructions per second. This uses the PAPI_VEC_SP (single-precision vectorized instructions) and PAPI_VEC_DP (double-precision vectorized instructions) events, although if those events are unavailable the Vector Instructions metric will be displayed instead.
Vector instructions: The number of vector instructions (floating-point and integer) per second. This uses the PAPI_VEC_INS event, but is only displayed if the events needed for the Floating-point vector instructions metric are not available.
Completed instructions: The number of completed instructions per second. This uses the PAPI_TOT_INS event, and is included to provide context for the other metrics in this group.