SPE functional description
This section describes the functionality of the SPE.
At a high level, SPE behavior consists of:
- Selection of the micro-operation to be profiled.
- Marking the selected micro-operation throughout its lifetime in the core, indicating within the various units that it is to be profiled.
- Storing data about the profiled micro-operation in internal registers during its lifetime in the core.
- Recording the profile data to memory after the profiled micro-operation retires, aborts, or is flushed.
While the SPE architecture allows either instructions or micro-operations to be profiled, the core profiles micro-operations to minimize the amount of logic needed to support SPE.
Profiles are collected periodically. Selection of the micro-operation to be profiled is driven by a simple down-counter, which is decremented once for each speculative micro-operation dispatched. When the counter reaches zero, a micro-operation is selected as the sample and is profiled throughout its lifetime in the microarchitecture.
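The selection scheme above can be sketched in software as a toy model. This is an illustration only, not a description of the hardware: the names `SampleCounter` and `sampled_indices` are hypothetical, and the sketch assumes the counter is reloaded with the same interval immediately after a sample is taken.

```python
class SampleCounter:
    """Toy model of a sampling down-counter (illustrative, not the hardware)."""

    def __init__(self, interval):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.interval = interval
        self.count = interval

    def dispatch(self):
        """Account for one dispatched micro-operation.

        Returns True when this micro-operation is selected for profiling.
        """
        self.count -= 1
        if self.count == 0:
            # Assumption: reload immediately with the same interval.
            self.count = self.interval
            return True
        return False


def sampled_indices(num_uops, interval):
    """Return the 0-based dispatch indices of the selected micro-operations."""
    counter = SampleCounter(interval)
    return [i for i in range(num_uops) if counter.dispatch()]


print(sampled_indices(num_uops=4096, interval=1024))
# prints [1023, 2047, 3071, 4095]
```

With an interval of 1024, every 1024th dispatched micro-operation is selected, whether or not it later retires, because the counter counts speculative dispatches.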
Profiling is expected to be largely non-intrusive: the core's performance should not be meaningfully perturbed while profiling is taking place. Permitted perturbation includes the use of LS/L2 bandwidth to record the profile data to memory.
How often this activity occurs depends on the sampling rate. Because the sampling rate is user-specified, a user may be able to choose a rate that is meaningfully intrusive to the core's performance.
- The core's recommended minimum sampling interval is one sample per 1024 micro-operations.
- This value is also communicated to software through the PMSIDR_EL1.Interval field.
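To see why a short sampling interval can become intrusive, the profile write traffic can be estimated with simple arithmetic. The sketch below is a rough model under stated assumptions, not a measurement of this core: the dispatch rate (4 micro-operations per cycle at 2 GHz) and the 64-byte record size are hypothetical inputs, the function name `profile_write_rate` is illustrative, and every sample is assumed to produce exactly one record written to memory.

```python
def profile_write_rate(uops_per_second, interval_uops, record_bytes):
    """Return (records per second, bytes per second) of profile writes.

    Assumes every sampled micro-operation yields one record of a fixed size.
    """
    if interval_uops < 1:
        raise ValueError("interval_uops must be at least 1")
    records_per_second = uops_per_second / interval_uops
    return records_per_second, records_per_second * record_bytes


# Hypothetical inputs, chosen only to make the arithmetic concrete.
UOPS_PER_SECOND = 4 * 2_000_000_000  # 4 micro-operations per cycle at 2 GHz
RECORD_BYTES = 64                    # assumed size of one profile record

for interval in (1024, 4096, 65536):
    records, bandwidth = profile_write_rate(UOPS_PER_SECOND, interval, RECORD_BYTES)
    print(f"interval {interval:>6}: {records:>12,.0f} records/s, "
          f"{bandwidth / 1e6:>8.1f} MB/s")
```

Under these assumptions, write traffic scales inversely with the sampling interval, so halving the interval doubles the LS/L2 bandwidth used for profile writes.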
Unlike trace information, SPE profiles are written to memory using a Virtual Address (VA). Profile writes therefore need access to the MMU to translate the VA to a Physical Address (PA), and need a path through which they can be written to memory.