Number of instructions executed by the processor in a given time interval
Article ID: 103489824
Published date: 13 Feb 2018
Last updated: -
Applies to: Cortex-M3, Cortex-M4
How can I count the number of instructions executed by the processor in a given time interval?
The processor contains an optional Data Watchpoint and Trace (DWT) unit, which provides a number of cycle counters.
The basic cycle counter DWT_CYCCNT increments on each clock cycle when the processor is not halted in debug state.
A variety of performance monitor counters are also provided. These count the clock cycles in which the processor departs from its nominal behavior of executing one instruction per cycle. Most of them count cycles in which no additional instruction is executed, for one of the following reasons:
DWT_CPICNT - additional cycles spent executing multi-cycle instructions, and cycles lost to instruction fetch stalls
DWT_EXCCNT - cycles spent performing exception entry and exit procedures
DWT_SLEEPCNT - cycles spent sleeping
DWT_LSUCNT - cycles spent waiting for loads and stores to complete
There is also a performance monitor for cycles saved by "folded" instructions:
DWT_FOLDCNT - cycles saved by instructions which execute in zero cycles
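As a rough illustration of how the counters listed above might be started from software, the following C sketch sets the enable bits in DWT_CTRL. The register offsets, the DEMCR.TRCENA bit and the DWT_CTRL bit positions follow the ARMv7-M Architecture Reference Manual; the names DwtRegs, dwt_start_counters and DWT_CTRL_ALL_COUNTERS are hypothetical and chosen only for this example. The register pointers are passed in as parameters so that the same code can be exercised against an ordinary structure in a host-side test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First registers of the DWT unit (base address 0xE0001000 on ARMv7-M). */
typedef struct {
    volatile uint32_t CTRL;     /* 0x00: control register              */
    volatile uint32_t CYCCNT;   /* 0x04: 32-bit cycle counter          */
    volatile uint32_t CPICNT;   /* 0x08: 8-bit count in bits [7:0]     */
    volatile uint32_t EXCCNT;   /* 0x0C: 8-bit count in bits [7:0]     */
    volatile uint32_t SLEEPCNT; /* 0x10: 8-bit count in bits [7:0]     */
    volatile uint32_t LSUCNT;   /* 0x14: 8-bit count in bits [7:0]     */
    volatile uint32_t FOLDCNT;  /* 0x18: 8-bit count in bits [7:0]     */
} DwtRegs;

#define DEMCR_TRCENA         (1u << 24) /* enables the DWT and ITM units   */
#define DWT_CTRL_CYCCNTENA   (1u << 0)
#define DWT_CTRL_CPIEVTENA   (1u << 17)
#define DWT_CTRL_EXCEVTENA   (1u << 18)
#define DWT_CTRL_SLEEPEVTENA (1u << 19)
#define DWT_CTRL_LSUEVTENA   (1u << 20)
#define DWT_CTRL_FOLDEVTENA  (1u << 21)
#define DWT_CTRL_NOPRFCNT    (1u << 24) /* 1: profiling counters absent    */
#define DWT_CTRL_NOCYCCNT    (1u << 25) /* 1: cycle counter absent         */

/* Hypothetical convenience mask covering all six counter enables. */
#define DWT_CTRL_ALL_COUNTERS                                      \
    (DWT_CTRL_CYCCNTENA | DWT_CTRL_CPIEVTENA | DWT_CTRL_EXCEVTENA | \
     DWT_CTRL_SLEEPEVTENA | DWT_CTRL_LSUEVTENA | DWT_CTRL_FOLDEVTENA)

/* Returns false, leaving the counters disabled, if the implementation
 * reports that the cycle counter or the profiling counters are absent. */
bool dwt_start_counters(DwtRegs *dwt, volatile uint32_t *demcr)
{
    assert(dwt && demcr);

    *demcr |= DEMCR_TRCENA; /* the DWT is not usable until TRCENA is set */

    if (dwt->CTRL & (DWT_CTRL_NOCYCCNT | DWT_CTRL_NOPRFCNT)) {
        return false;
    }

    /* Stop everything, clear CYCCNT, then set the enables again. The
     * architecture describes each profiling counter as being initialized
     * to 0 when its enable bit is set. */
    dwt->CTRL &= ~DWT_CTRL_ALL_COUNTERS;
    dwt->CYCCNT = 0;
    dwt->CTRL |= DWT_CTRL_ALL_COUNTERS;
    return true;
}
```

On a target, the call would presumably look like dwt_start_counters((DwtRegs *)0xE0001000u, (volatile uint32_t *)0xE000EDFCu), where 0xE000EDFC is the address of DEMCR. Many toolchains ship CMSIS headers that already define these registers, in which case those definitions are preferable to the hand-written structure shown here.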
So if the processor includes the DWT profiling counters, the instruction count can be calculated as:
# instructions = CYCCNT - CPICNT - EXCCNT - SLEEPCNT - LSUCNT + FOLDCNT
This result is architecturally defined to be approximate. See the section "Profiling counter accuracy" in the ARMv7-M Architecture Reference Manual for details.
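The calculation above can be sketched in C as follows. This is only an illustration: the type DwtSnapshot and the functions delta8 and dwt_instruction_estimate are hypothetical names, and the caller is assumed to have read the counter registers at the start and at the end of the interval. DWT_CYCCNT is a 32-bit counter, whereas the five profiling counters are 8-bit counters that wrap, so the sketch takes each difference modulo the counter width.

```c
#include <assert.h>
#include <stdint.h>

/* Counter values captured at one point in time (hypothetical type). */
typedef struct {
    uint32_t cyccnt;   /* DWT_CYCCNT, 32 bits           */
    uint8_t  cpicnt;   /* DWT_CPICNT, low 8 bits only   */
    uint8_t  exccnt;   /* DWT_EXCCNT, low 8 bits only   */
    uint8_t  sleepcnt; /* DWT_SLEEPCNT, low 8 bits only */
    uint8_t  lsucnt;   /* DWT_LSUCNT, low 8 bits only   */
    uint8_t  foldcnt;  /* DWT_FOLDCNT, low 8 bits only  */
} DwtSnapshot;

/* Difference between two readings of an 8-bit counter, modulo 256. */
static uint32_t delta8(uint8_t start, uint8_t end)
{
    return (uint8_t)(end - start);
}

/* Approximate instruction count between two snapshots:
 * CYCCNT - CPICNT - EXCCNT - SLEEPCNT - LSUCNT + FOLDCNT.
 * Assumes that CYCCNT wrapped at most once and that each 8-bit
 * counter advanced by fewer than 256 counts during the interval. */
uint32_t dwt_instruction_estimate(const DwtSnapshot *start,
                                  const DwtSnapshot *end)
{
    assert(start && end);

    uint32_t cycles = end->cyccnt - start->cyccnt; /* modulo 2^32 */

    uint32_t not_executing = delta8(start->cpicnt,   end->cpicnt)
                           + delta8(start->exccnt,   end->exccnt)
                           + delta8(start->sleepcnt, end->sleepcnt)
                           + delta8(start->lsucnt,   end->lsucnt);

    uint32_t folded = delta8(start->foldcnt, end->foldcnt);

    return cycles - not_executing + folded;
}
```

Because the profiling counters are only 8 bits wide, this direct subtraction is meaningful only for short intervals. Over longer intervals each counter can wrap several times, and the wraps would have to be accounted for separately, for example by counting the counter overflow event packets that the DWT emits in the trace stream.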
For a finished, packaged chip that includes an ETM module for instruction trace, a debugger connected to the trace port should be able to count instructions exactly, because every instruction is reported in the trace stream exported on the trace port. However, depending on the processor clock speed and the trace channel bandwidth, the trace stream may contain intermittent gaps when the trace channel is overloaded.
Chip designers who are running a logic simulation of the chip design, using either the RTL description of the processor or the Design Simulation Model (DSM), can observe the exact instruction count by enabling the "tarmac" logging feature. This feature generates a text log file that records the processor activity during the simulation run. Designers can also enable the ETM interface of the processor (whether or not the ETM option is implemented in the design) and count the cycles in which the ETMIVALID signal is asserted High.