Performance Monitoring Unit
The MMU-600 includes a PMU for the TCU and a PMU for each TBU. The PMU events and counters indicate the runtime performance of the MMU-600.
The MMU-600 includes logic to gather various statistics on the operation of the MMU during runtime, using events and counters. These events, which the SMMUv3 architecture defines, provide useful information about the behavior of the MMU. You can use this information when debugging or profiling traffic.
SMMUv3 architectural performance events
Both the TCU and the TBU implement performance events that the SMMUv3 Performance Monitor extension defines.
The SMMU_PMCG_SMR0 register can filter some events so that only events with a particular StreamID are counted. This event filtering includes:
- Speculative transactions and translations.
- Transactions and translations that result in a terminated transaction or a translation fault.
The following table shows the architecturally defined MMU-600 TCU performance events.
Table 2-5 SMMUv3 performance events for the TCU
Event | Event ID | SMMU_PMCG_SMR0 filterable | Description |
---|---|---|---|
Clock cycle. | | No | Counts clock cycles. Cycles where the clock is gated after a clock Q-Channel handshake are not counted. |
Transaction. | | Yes | Counts translation requests that originate from a DTI-TBU or DTI-ATS master. |
TLB miss caused by incoming transaction or translation request. | | Yes | Counts translation requests where the translation walks new translation table entries. |
Configuration cache miss caused by transaction or translation request. | | Yes | Counts translation requests where the translation walks new configuration table entries. |
Translation table walk access. | | Yes | Counts translation table walk accesses. |
Configuration structure access. | | Yes | Counts configuration table walk accesses. |
PCIe ATS Translation Request received. | | Yes | Counts translation requests that originate from a DTI-ATS master. |
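When profiling, the raw TCU counts are often easier to interpret as ratios. The following is a minimal sketch, assuming that the Transaction, TLB miss, and Translation table walk access counters were sampled over the same interval. The function name and the sample values are hypothetical, and the ratios are illustrative metrics rather than anything the architecture defines.

```python
def tcu_translation_ratios(transactions, tlb_misses, walk_accesses):
    """Derive two ratios from TCU counter values taken over one interval."""
    # Fraction of translation requests that walked new translation table entries.
    miss_ratio = tlb_misses / transactions if transactions else 0.0
    # Average number of translation table walk accesses per TLB miss.
    walks_per_miss = walk_accesses / tlb_misses if tlb_misses else 0.0
    return {
        "tlb_miss_ratio": miss_ratio,
        "walk_accesses_per_miss": walks_per_miss,
    }


# Hypothetical counter readings, not measured data.
ratios = tcu_translation_ratios(transactions=1000, tlb_misses=250, walk_accesses=750)
print(ratios)
```

Both ratios fall back to 0.0 when the denominator is zero, so an idle sampling interval does not raise an exception.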
The following table shows the architecturally defined MMU-600 TBU performance events.
Table 2-6 SMMUv3 performance events for the TBU
Event | Event ID | SMMU_PMCG_SMR0 filterable | Description |
---|---|---|---|
Clock cycle. | | No | Counts clock cycles. Cycles where the clock is gated after a clock Q-Channel handshake are not counted. |
Transaction. | | Yes | Counts transactions that are issued on the TBM interface. |
TLB miss caused by incoming transaction or translation request. | | Yes | Counts non-speculative translation requests that are issued to the TCU. |
PCIe ATS Translation Request received. | | Yes | Counts ATS-translated transactions that are issued on the TBM interface. |
See the Arm® System Memory Management Unit Architecture Specification, SMMU architecture version 3.0 and version 3.1 for more information.
MMU-600 TCU events
The MMU-600 PMU can be configured to monitor a range of implementation defined TCU performance events.
The SMMU_PMCG_SMR0 register can filter some TCU performance events so that only events with a particular StreamID are counted. This event filtering includes:
- Speculative transactions and translations.
- Transactions and translations that result in a terminated transaction or a translation fault.
The following table shows the TCU performance events.
Table 2-7 MMU-600 TCU performance events
Event | Event ID | SMMU_PMCG_SMR0 filterable | Description |
---|---|---|---|
S1L0WC lookup | | Yes | Counts translation requests that access the S1L0WC walk cache. |
S1L0WC miss | | Yes | Counts translation requests that access the S1L0WC walk cache and do not result in a hit. |
S1L1WC lookup | | Yes | Counts translation requests that access the S1L1WC walk cache. |
S1L1WC miss | | Yes | Counts translation requests that access the S1L1WC walk cache and do not result in a hit. |
S1L2WC lookup | | Yes | Counts translation requests that access the S1L2WC walk cache. |
S1L2WC miss | | Yes | Counts translation requests that access the S1L2WC walk cache and do not result in a hit. |
S1L3WC lookup | | Yes | Counts translation requests that access the S1L3WC walk cache. |
S1L3WC miss | | Yes | Counts translation requests that access the S1L3WC walk cache and do not result in a hit. |
S2L0WC lookup | | Yes | Counts translation requests that access the S2L0WC walk cache. |
S2L0WC miss | | Yes | Counts translation requests that access the S2L0WC walk cache and do not result in a hit. |
S2L1WC lookup | | Yes | Counts translation requests that access the S2L1WC walk cache. |
S2L1WC miss | | Yes | Counts translation requests that access the S2L1WC walk cache and do not result in a hit. |
S2L2WC lookup | | Yes | Counts translation requests that access the S2L2WC walk cache. |
S2L2WC miss | | Yes | Counts translation requests that access the S2L2WC walk cache and do not result in a hit. |
S2L3WC lookup | | Yes | Counts translation requests that access the S2L3WC walk cache. |
S2L3WC miss | | Yes | Counts translation requests that access the S2L3WC walk cache and do not result in a hit. |
WC read | | Yes | Counts reads from the walk cache RAMs, excluding reads that are caused by invalidation requests. Note: A single walk cache lookup might result in multiple RAM reads. This behavior permits contiguous entries to be located. |
Buffered translation | | Yes | Counts translations written to the translation request buffer because all translation slots are full. |
CC lookup | | Yes | Counts lookups into the configuration cache. |
CC read | | Yes | Counts reads from the configuration cache RAMs, excluding reads that are caused by invalidation requests. Note: A single cache lookup might result in multiple RAM reads. This behavior permits contiguous entries to be located. |
CC miss | | Yes | Counts lookups into the configuration cache that result in a miss. |
Speculative translation | | Yes | Counts translation requests that are marked as speculative. |
S1L0WC error | | No | RAS corrected error in S1L0 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S1L1WC error | | No | RAS corrected error in S1L1 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S1L2WC error | | No | RAS corrected error in S1L2 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S1L3WC error | | No | RAS corrected error in S1L3 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S2L0WC error | | No | RAS corrected error in S2L0 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S2L1WC error | | No | RAS corrected error in S2L1 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S2L2WC error | | No | RAS corrected error in S2L2 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
S2L3WC error | | No | RAS corrected error in S2L3 walk cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
Configuration cache error | | No | RAS corrected error in configuration cache. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
Note
A single DTI translation request might correspond to multiple translation request events in either of the following circumstances:
- A translation results in a stall fault event and is restarted.
- If a translation results in a stall fault event because of the Event queue being full, the translation is retried when an Event queue slot becomes available.
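Each walk cache has a lookup event and a miss event, so a per-cache hit rate can be derived from a pair of counts. The following is a minimal sketch of that arithmetic. The dictionary layout, the function name, and the sample counts are assumptions made for illustration, with keys that reuse the event names from the table.

```python
# Names of the eight walk caches: stage 1 and stage 2, levels 0 to 3.
WALK_CACHES = [f"S{stage}L{level}WC" for stage in (1, 2) for level in range(4)]


def walk_cache_hit_rates(counts):
    """Return hit rate per walk cache for every cache that saw lookups."""
    rates = {}
    for wc in WALK_CACHES:
        lookups = counts.get(f"{wc} lookup", 0)
        misses = counts.get(f"{wc} miss", 0)
        if lookups:  # skip caches with no lookups to avoid dividing by zero
            rates[wc] = 1.0 - misses / lookups
    return rates


# Hypothetical counts, not measured data.
sample = {
    "S1L0WC lookup": 200, "S1L0WC miss": 50,
    "S2L3WC lookup": 0, "S2L3WC miss": 0,
}
rates = walk_cache_hit_rates(sample)
print(rates)
```

Because a counter group provides only a small number of counters, profiling all eight walk caches would likely need several measurement passes over comparable traffic, with a different pair of lookup and miss events selected in each pass.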
MMU-600 TBU events
The MMU-600 PMU can be configured to monitor a range of implementation defined TBU performance events.
The SMMU_PMCG_SMR0 register can filter the TBU performance events so that only events with a particular StreamID are counted. This event filtering includes:
- Speculative transactions and translations.
- Transactions and translations that result in a terminated transaction or a translation fault.
The following table shows the TBU performance events.
Table 2-8 MMU-600 TBU performance events
Event | Event ID | SMMU_PMCG_SMR0 filterable | Description |
---|---|---|---|
Main TLB lookup | | Yes | Counts Main TLB lookups. |
Main TLB miss | | Yes | Counts translation requests that miss in the Main TLB. |
Main TLB read | | Yes | Counts once per access to the Main TLB RAMs, excluding reads that invalidation requests cause. Note: A transaction might access the Main TLB multiple times to look for different page sizes. |
Micro TLB lookup | | Yes | Counts micro TLB lookups. |
Micro TLB miss | | Yes | Counts translation requests that miss in the micro TLB. |
Slots full | | No | Counts once per cycle when all slots are occupied and not ready to issue transactions downstream. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
Out of translation tokens | | No | Counts once per cycle when a translation request cannot be issued because all translation tokens are in use. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
Write data buffer full | | No | Counts once per cycle when a transaction is blocked because the write data buffer is full. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
Translation request | | Yes | Counts translation requests, including both speculative and non-speculative requests. |
Write data uses write data buffer | | Yes | Counts transactions with write data that is stored in the write data buffer. |
Write data bypasses write data buffer | | Yes | Counts transactions with write data that bypasses the write data buffer. |
MakeInvalid downgrade | | Yes | Counts when either: |
Stash fail | | Yes | Counts when either: Note: A StashOnceShared or StashOnceUnique transaction that is terminated because of a StreamDisable or GlobalDisable translation response does not cause this event to count. |
Main TLB error | | No | RAS corrected error in Main TLB. This Secure event is visible only when the SMMU_PMCG_SCR.SO bit is set to 1. |
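Several TBU events count once per cycle, so dividing them by the Clock cycle count from the same interval gives the fraction of ungated cycles spent in each condition. The following is a minimal sketch of that normalization. The function name and the sample values are hypothetical, and it assumes that the Secure per-cycle events are visible, that is, SMMU_PMCG_SCR.SO is set to 1.

```python
def stall_fractions(clock_cycles, per_cycle_counts):
    """Normalize per-cycle event counts by the clock cycle count."""
    if clock_cycles <= 0:
        raise ValueError("clock_cycles must be positive")
    return {name: count / clock_cycles for name, count in per_cycle_counts.items()}


# Hypothetical counter readings, not measured data.
fractions = stall_fractions(
    10_000,
    {"Slots full": 500, "Out of translation tokens": 2_500},
)
print(fractions)
```

A high fraction for one of these events suggests which resource limits throughput, although the events can overlap in time, so the fractions are not expected to sum to a meaningful total.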
SMMUv3 PMU register architectural options
The SMMUv3 architecture defines the Performance Monitor Counter Group (PMCG) configuration register, SMMU_PMCG_CFGR. An MMU-600 implementation assumes fixed values for SMMU_PMCG_CFGR, and these values define behavioral aspects of the implementation.
The following table shows the SMMU_PMCG_CFGR register options that the MMU-600 TCU and TBU use.
Table 2-9 MMU-600 SMMU_PMCG_CFGR register architectural options
Field | Default value | Description for default value |
---|---|---|
SID_FILTER_TYPE | 1 | A single StreamID filter applies to all PMCG counters. |
CAPTURE | 1 | Capture of counter values into SVRn registers is supported. |
MSI | 0 | The counter group does not support Message Signaled Interrupts (MSIs). |
RELOC_CTRS | 1 | The PMCG registers are relocated to page 1 of the PMU address map. |
SIZE | | The counter group implements 32-bit counters. |
NCTR | | The counter group includes 4 counters. |
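Software that reads SMMU_PMCG_CFGR typically unpacks it into these fields. The following is a minimal sketch of such a decoder. The bit positions, and the treatment of NCTR and SIZE as minus-one encodings, are assumptions about the SMMUv3 register layout that must be verified against the architecture specification. The example register value is constructed for illustration and is not quoted from this document.

```python
# Assumed field layout of SMMU_PMCG_CFGR: name -> (lowest bit, width in bits).
# Verify these positions against the SMMUv3 architecture specification.
CFGR_FIELDS = {
    "NCTR": (0, 6),
    "SIZE": (8, 6),
    "RELOC_CTRS": (20, 1),
    "MSI": (21, 1),
    "CAPTURE": (22, 1),
    "SID_FILTER_TYPE": (23, 1),
}


def decode_pmcg_cfgr(value):
    """Split a raw SMMU_PMCG_CFGR value into named fields."""
    fields = {
        name: (value >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in CFGR_FIELDS.items()
    }
    # Assumption: NCTR and SIZE hold "number of counters minus one" and
    # "counter width in bits minus one" respectively.
    fields["num_counters"] = fields["NCTR"] + 1
    fields["counter_bits"] = fields["SIZE"] + 1
    return fields


# Hypothetical raw value built to match the options in the table above.
cfg = decode_pmcg_cfgr(0xD01F03)
print(cfg)
```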
PMU snapshot interface
The Performance Monitoring Unit (PMU) snapshot interface is included on the TCU and on each TBU. You can use this asynchronous interface to initiate a PMU snapshot. A simultaneous snapshot of each counter register is created and copied to the respective SMMU_PMCG_SVRn register.
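Captured values are usually compared across two snapshots to obtain the number of events in the interval between them. The following is a minimal sketch of that subtraction, assuming 32-bit counters and at most one counter wrap between snapshots. The function name and the sample values are hypothetical.

```python
COUNTER_BITS = 32  # counter width, assumed from the 32-bit counter option


def counter_delta(previous, current, bits=COUNTER_BITS):
    """Events counted between two snapshots, tolerating one wraparound."""
    return (current - previous) % (1 << bits)


# Hypothetical snapshot values: the second pair wraps past 0xFFFFFFFF.
print(counter_delta(5, 12))
print(counter_delta(0xFFFFFFF0, 0x10))
```

Taking the difference modulo 2^32 gives the correct count when the counter has wrapped once, but it cannot detect multiple wraps, so the snapshot interval must be short enough for the busiest event.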
The PMU snapshot sequence is a 4-phase handshake. Both pmusnapshot_req and pmusnapshot_ack are LOW after reset. A snapshot occurs on the rising edge of pmusnapshot_req, and is equivalent to writing the value 1 to SMMU_PMCG_CAPR.CAPTURE.
The pmusnapshot_req signal is sampled using synchronizing registers. A register drives pmusnapshot_ack so that the connected component can sample the signal asynchronously.
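The handshake can be illustrated with a small behavioral model. The following is a minimal sketch, assuming that pmusnapshot_ack follows pmusnapshot_req once the request has been sampled. Synchronizer latency is not modeled, and the class and attribute names are hypothetical.

```python
class SnapshotPort:
    """Toy model of the PMU side of the snapshot handshake."""

    def __init__(self, counters):
        self.counters = counters          # live counter values
        self.svr = [0] * len(counters)    # captured values (SVRn)
        self.req = 0                      # pmusnapshot_req, LOW after reset
        self.ack = 0                      # pmusnapshot_ack, LOW after reset

    def drive_req(self, level):
        """Apply a new pmusnapshot_req level and update ack."""
        if level and not self.req:
            # Rising edge of req: capture all counters at the same time.
            self.svr = list(self.counters)
        self.req = level
        # Assumption: ack mirrors req after sampling (latency not modeled).
        self.ack = level


port = SnapshotPort([10, 20, 30, 40])
port.drive_req(1)        # phases 1 and 2: req rises, snapshot taken, ack rises
port.counters[0] = 11    # live counter keeps changing after the capture
port.drive_req(0)        # phases 3 and 4: req falls, ack falls
print(port.svr, port.req, port.ack)
```

In this model, the requester waits for ack to go HIGH before lowering req, and waits for ack to go LOW before starting another snapshot, which is the usual 4-phase ordering.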