Trace components

There are many different types of trace components. Most trace components fall into three categories:

  • Trace source - a trace component which generates trace data
  • Trace sink - a trace component which stores or outputs trace data
  • Trace link - a component which links trace or non-trace components together

In this section of the guide, we define the different types of trace components, describe their function, and provide the location of their implementation details. The Specifications section lists the trace component architecture specification documents.

Note: Debug subsystem design is highly configurable. It is the role of the target designer to create a debug subsystem design that is suitable for the target. This means that your target design might only implement a subset of the components that are described in this section.

Trace source: Embedded Trace Macrocell

Depending on the processor, Armv8 systems have an ETMv4 or an ETMv3.x. For example, Armv8-A systems are ETMv4-only, and Armv8-M systems might have ETMv4 or ETMv3.x.

The Embedded Trace Macrocell (ETM) architectures permit instruction and data trace. As we mentioned in What is trace?, a particular ETM implementation might not include data trace support. For example, the Armv8-A ETM is instruction trace only. Refer to your target documentation to learn whether the implemented ETMs support data tracing.

Because ETM trace data is packetized, the data must be decompressed and decoded before it can be analyzed.

Triggers and filters control ETM trace data generation. Filters allow you to control how much trace data is generated. This is useful if:

  • The code or data being traced is large and can cause bandwidth issues on the target
  • There are only certain points where having trace data is useful

Both triggers and filters are typically set using native or self-hosted trace software or an external debugger.

Triggers act like a start point to trace a region of interest. When a trigger is set, the ETM only generates trace data around the trigger point.

Filtering works for instruction and data trace by enabling trace generation between two filter points: a start point and an end point. ETMv4 supports richer filters than earlier ETM versions, for example filtering on context ID, Security state, and Exception level.
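To illustrate the start point and end point behavior described above, here is a minimal Python sketch. It is a software model only, not ETM programming code. The function name `filter_trace`, the example addresses, and the choice to include both filter points in the output are assumptions made for illustration.

```python
def filter_trace(addresses, start, end):
    """Return the executed addresses that fall inside a start/end filter window.

    Tracing is switched on when `start` executes and switched off after
    `end` executes. Including both points is a modelling choice.
    """
    traced = []
    tracing = False
    for addr in addresses:
        if addr == start:
            tracing = True
        if tracing:
            traced.append(addr)
        if addr == end:
            tracing = False
    return traced


# Hypothetical execution history: the region 0x200..0x208 runs twice.
executed = [0x100, 0x104, 0x200, 0x204, 0x208, 0x108, 0x200, 0x208]
print([hex(a) for a in filter_trace(executed, start=0x200, end=0x208)])
```

Only the addresses between the two filter points appear in the result, which shows how filtering reduces the amount of trace data that is generated.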

Multiple cores or processors can share a single ETM. If multiple cores share an ETM, trace data is generated for only one core at a time, so you cannot observe trace data from the sharing cores concurrently. ETM sharing is not allowed for Armv8-A cores.

An ETM can generate cycle-accurate trace and can insert timestamps into the trace data. Cycle-accurate trace is useful for determining which code or functions are consuming the most execution time. Timestamps are useful when calculating how long a code or function takes to execute and for correlating trace data from different sources.

Note: Including cycle-accurate and timestamping information in the trace data increases the overall trace data size. This might be a problem for systems with limited trace data storage or small off-chip trace port sizes. Consider using triggers and filters to limit the amount of trace data that is generated when using cycle-accurate trace or timestamps.
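As a rough illustration of how timestamps help to correlate trace data from different sources, the following Python sketch merges two already decoded event streams into one timeline. The event tuples, source names, and the `correlate` helper are hypothetical. Real trace decoders work on packet streams, not on ready-made tuples.

```python
import heapq

# Hypothetical decoded events: (timestamp, source, description).
# Each stream is assumed to be in timestamp order already.
etm_events = [(100, "ETM", "enter main"), (250, "ETM", "enter irq_handler")]
stm_events = [(120, "STM", "log: init done"), (240, "STM", "hardware event 3")]


def correlate(*streams):
    """Merge timestamp-ordered event streams into a single timeline."""
    return list(heapq.merge(*streams, key=lambda event: event[0]))


for timestamp, source, description in correlate(etm_events, stm_events):
    print(timestamp, source, description)
```

Because both sources use a common time base, the merged timeline shows which instrumentation messages were emitted between which instruction trace events.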

ETM implementation details are either described in an ETM section of the Technical Reference Manual (TRM) for a processor, or in a separate TRM for that ETM. First, check whether the TRM for the processor includes an ETM section. If the TRM does not include this section, search for a separate TRM document. Separate TRM document names usually follow the format:

CoreSight ETM-<processor name> Technical Reference Manual

For example, the TRM for the Cortex-R7 ETM is CoreSight ETM-R7 Technical Reference Manual.

Trace source: Program Trace Macrocell

Program Trace Macrocell (PTM) is only found in systems before Armv8. PTMs perform instruction trace only.

Because PTM trace data is packetized, the data must be decompressed and decoded before it can be analyzed.

PTM can provide triggering and filtering capabilities. The PTM implementation determines whether these capabilities are present and the number of capabilities that are available. The PTM trigger and filter capabilities work the same way that they work for an ETM.

A PTM can generate cycle-accurate trace and can insert timestamps into the trace data. These features work the same way that they work for an ETM.

PTM implementation details are either described in a PTM section of the TRM for a processor, or in a separate TRM for that PTM. Start by checking whether the TRM for the processor includes a PTM section. If the TRM does not include this section, search for a separate TRM document. Separate TRM document names usually follow the format:

CoreSight PTM-<processor name> Technical Reference Manual

For example, the TRM for the Cortex-A9 PTM is CoreSight PTM-A9 Technical Reference Manual.
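If you need to search for separate TRMs for several processors, the naming pattern can be captured in a small helper. This Python sketch assumes the usual format that is described in this section. The function name `separate_trm_title` is hypothetical, and because the format is only the usual one, a given document might be named differently.

```python
def separate_trm_title(macrocell, processor_suffix):
    """Build the usual title of a separate ETM or PTM TRM.

    `processor_suffix` is the part of the processor name after "Cortex-",
    for example "R7" or "A9".
    """
    if macrocell not in ("ETM", "PTM"):
        raise ValueError("macrocell must be 'ETM' or 'PTM'")
    return f"CoreSight {macrocell}-{processor_suffix} Technical Reference Manual"


print(separate_trm_title("ETM", "R7"))
print(separate_trm_title("PTM", "A9"))
```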

Trace source: Instrumentation Trace Macrocell

The Instrumentation Trace Macrocell (ITM) is a low-bandwidth, application-driven trace source. The ITM is mainly used to:

  • Support printf-style debugging
  • Trace OS and application events
  • Output diagnostic system information

The ITM outputs trace data as packets. The four sources for the packets are:

  • Software trace. Software can write directly to the ITM stimulus registers to generate packets.
  • Hardware trace. The debug logic generates these packets, and the ITM outputs them. This is for Cortex-M processors only.
  • Timestamping
  • Global system timestamping. This is for Cortex-M processors only.

The ITM is programmed to control what information is traced.
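To make the software trace path more concrete, the following Python sketch models how a write to a stimulus register could become a packet. It assumes the software source packet layout from the Armv7-M ITM packet protocol: one header byte that carries the port number and payload size, followed by a little-endian payload. Check the architecture reference manual for your processor before relying on this layout. The function names are hypothetical, and the sketch only covers stimulus ports 0 to 31.

```python
def encode_itm_software_packet(port, value, size):
    """Model one stimulus register write as header byte + little-endian payload.

    Assumed header layout: bits [7:3] = port number, bit [2] = 0 for a
    software source, bits [1:0] = payload size code.
    """
    if not 0 <= port <= 31:
        raise ValueError("this sketch only models stimulus ports 0 to 31")
    size_codes = {1: 0b01, 2: 0b10, 4: 0b11}
    if size not in size_codes:
        raise ValueError("payload size must be 1, 2, or 4 bytes")
    header = (port << 3) | size_codes[size]
    return bytes([header]) + value.to_bytes(size, "little")


def itm_print(text, port=0):
    """Model printf-style output: one single-byte packet per character."""
    return b"".join(
        encode_itm_software_packet(port, char, 1) for char in text.encode("ascii")
    )


print(itm_print("Hi").hex())
```

On a real target, the software writes the value to the stimulus register and the ITM hardware forms the packet. The model is only meant to show why this kind of trace is low-bandwidth: every payload carries a header.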

Cortex-M ITM implementation details are found in the ITM section in the TRM for the Cortex-M processor. For example, the implementation details for the ITM for the Cortex-M4 are in the Instrumentation Trace Macrocell Unit section of the Arm Cortex‑M4 Processor Technical Reference Manual.

General CoreSight ITM implementation details are found in the CoreSight Components Technical Reference Manual.

Trace source: System Trace Macrocell

The System Trace Macrocell (STM) is a trace source that is designed to provide system trace and instrumentation information. This information includes:

  • Memory-mapped writes to the STM Advanced eXtensible Interface (AXI) slave that carry information about the behavior of the software
  • A hardware event interface to signify certain events that are happening in the system

The STM supports timestamps. These timestamps allow correlation with other timestamping trace sources in the CoreSight system, for example instruction trace.

For implementation details, the STM and STM-500 each have their own TRM:

  • CoreSight System Trace Macrocell Technical Reference Manual
  • Arm CoreSight STM-500 System Trace Macrocell Technical Reference Manual

Trace source: Embedded Logic Analyzer

The Embedded Logic Analyzer (ELA) is a CoreSight component that monitors signals within a design. The ELA is most commonly used to monitor bus signals to allow the debug of bus and memory issues. There are two ELA variants:

  • CoreSight ELA-500 Embedded Logic Analyzer
  • CoreSight ELA-600 Embedded Logic Analyzer

ELA-500 and ELA-600 both generate packetized output. You can configure ELA-500 and ELA-600 to store trace data in a dedicated SRAM. You can configure ELA-600 to output trace data onto the AMBA Trace Bus (ATB). The ELA-600 target designer determines whether the trace data is stored in a dedicated SRAM or output onto the ATB.

Trace sink: Trace Memory Controller

The Trace Memory Controller (TMC) captures trace data into local or system memory, or streams trace data to a High-Speed Serial Trace Port (HSSTP). The trace is read by an off-chip external debugger or by on-chip self-hosted debug software.

The implementation details for the Arm CoreSight SoC-600 TMC are available in the Arm CoreSight System-on-Chip SoC-600 Technical Reference Manual. CoreSight SoC-600 implements the Arm Debug Interface Architecture Specification ADIv6.

The implementation details for the Arm CoreSight SoC-400 TMC are available in the CoreSight Trace Memory Controller Technical Reference Manual. CoreSight SoC-400 implements the Arm Debug Interface Architecture Specification ADIv5.

Consult your target designer if you are unsure which of these TRMs applies to your target.

The TMC uses one of four configurations that the target designer chooses:

  • Embedded Trace Buffer (ETB)
  • Embedded Trace FIFO (ETF)
  • Embedded Trace Router (ETR)
  • Embedded Trace Streamer (ETS)

Let’s look at these different TMC configurations in more detail:

Embedded Trace Buffer
The Embedded Trace Buffer (ETB) contains a dedicated SRAM that stores generated trace data on-chip for later retrieval. The SRAM acts like a circular buffer that wraps when the buffer size limit is reached. Buffer wrapping works by replacing the oldest trace data with the newest data.

A single ETB can store multiple ETM and PTM trace streams. Normally, the buffer is small, typically from 4KB to 64KB, so the amount of trace data that can be captured is limited. It is usually necessary to use trace source triggering and filtering capabilities to limit the amount of trace data that is captured to ensure that important trace data is not lost due to buffer wrapping.
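The wrapping behavior can be modeled in a few lines of Python. This is a simplified software model, not a description of the ETB hardware. The class name `CircularTraceBuffer` and the 8-byte size are assumptions that keep the example small.

```python
from collections import deque


class CircularTraceBuffer:
    """Model of a fixed-size trace buffer that overwrites its oldest data."""

    def __init__(self, size_bytes):
        # A deque with maxlen drops the oldest entry when a new one is added.
        self._data = deque(maxlen=size_bytes)
        self.wrapped = False

    def write(self, data):
        for byte in data:
            if len(self._data) == self._data.maxlen:
                self.wrapped = True  # The oldest byte is about to be lost.
            self._data.append(byte)

    def read_all(self):
        return bytes(self._data)


etb = CircularTraceBuffer(8)
etb.write(b"ABCDEFGHIJ")          # 10 bytes into an 8-byte buffer
print(etb.read_all(), etb.wrapped)  # b'CDEFGHIJ' True
```

The first two bytes are lost after the buffer wraps. This is why triggering and filtering matter when the buffer is small.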

Embedded Trace FIFO
The Embedded Trace FIFO (ETF) contains a dedicated SRAM that can be used as a circular buffer, a hardware FIFO, or a software FIFO. In Circular Buffer mode, the ETF has the same functionality as the ETB. In Hardware FIFO mode, the ETF is typically used to smooth out fluctuations in the trace data. In Software FIFO mode, on-chip software uses the ETF to read out data over the debug AMBA Peripheral Bus (APB) interface. The ETF mode is configured at runtime.

Embedded Trace Router
With the Embedded Trace Router (ETR), trace can be routed over an AXI interface to the system memory, or to any other AXI slave. An ETR allows larger amounts of trace data to be stored on-chip than an ETB or ETF allows. Like the ETF, the ETR has Circular Buffer and Software FIFO modes. The ETR programmer decides where to store the trace data in memory. Refer to your target documentation or target designer for the best place to store ETR trace data on your target, so that memory that is already in use is not overwritten.

Embedded Trace Streamer
The Embedded Trace Streamer (ETS) routes trace data over an AXI4-Stream interface to a streaming device, for example an HSSTP link layer, either directly or through an AXI4-Stream interconnect. The ETS behaves in a similar way to the ETR, by keeping the same baseline functionality. However, the ETS does not include features that are not applicable to trace data streaming, for example incrementing address support.

Trace sink: Trace Port Interface Unit

The Trace Port Interface Unit (TPIU) drives trace data to external pins on a target, so that the Trace Port Analyzer (TPA), which is often part of a debug probe, can capture the trace data. The TPIU:

  • Coordinates the stopping of trace capture when it receives a trigger
  • Inserts source identification information into the trace stream so that trace data can be re-associated with its trace source
  • Outputs the trace data over trace port pins
  • Outputs patterns over the trace port. This pattern output is often referred to as TPIU pattern generation. This allows a TPA to tune its capture logic to the trace port, which maximizes the trace data output frequency on the trace port.

TPIU implementation details are found in either the Arm CoreSight System-on-Chip SoC-600 Technical Reference Manual or the Arm CoreSight SoC-400 Technical Reference Manual.

CoreSight SoC-600 implements ADIv6. CoreSight SoC-400 implements ADIv5. Consult your target designer if you are unsure which of these TRMs applies to your target.

Trace link: funnel

The funnel, also called an ATB funnel, merges multiple ATBs into a single ATB. Typically, the single ATB is then routed to a trace component, for example another funnel, an ETB, an ETR, or a TPIU. The funnel comes in programmable or non-programmable configurations. With the programmable configuration, the funnel priority setting is configurable.
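The following Python sketch models the merging role of a funnel. It is a deliberately simplified model: each input port holds a queue of (trace ID, payload) items, and the funnel always forwards from the ready port with the best priority. Treating a lower number as a higher priority is a modelling assumption here, as are the function name and data shapes. Real funnel arbitration has more detail than this.

```python
def funnel(inputs, priorities):
    """Merge several input streams into one output stream.

    inputs:     {port: [(trace_id, payload), ...]}
    priorities: {port: int}, where a lower value wins arbitration
                (an assumption of this model).
    """
    queues = {port: list(items) for port, items in inputs.items()}
    merged = []
    while any(queues.values()):
        ready = [port for port in queues if queues[port]]
        # Break ties on the port number so that the result is deterministic.
        port = min(ready, key=lambda p: (priorities[p], p))
        merged.append(queues[port].pop(0))
    return merged


inputs = {
    0: [(0x10, "etm-a"), (0x10, "etm-b")],
    1: [(0x20, "stm-a")],
}
print(funnel(inputs, priorities={0: 1, 1: 0}))
```

Because every item keeps its trace ID, the merged stream can still be separated by source later.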

Funnel implementation details are found in either the Arm CoreSight System-on-Chip SoC-600 Technical Reference Manual or the Arm CoreSight SoC-400 Technical Reference Manual.

CoreSight SoC-600 implements ADIv6. CoreSight SoC-400 implements ADIv5. Consult your target designer if you are unsure which of these TRMs applies to your target.

Trace link: replicator

The replicator, also called an ATB replicator, splits a single ATB into two ATBs. This allows a design to contain multiple trace sinks. Many designs have an ETF or a funnel that is followed by a replicator to route the ATB to various trace sinks like an ETB, TPIU, or ETS. The replicator comes in programmable or non-programmable configurations. With the programmable configuration, the replicator can filter each ATB based on the trace identifier, which is called the trace ID.
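As a sketch of the programmable configuration, the following Python model copies one stream to two outputs and drops selected trace IDs on each output. Filtering on individual IDs is a simplification for illustration. Check the TRM for how the filter registers on your replicator group trace IDs. The function and parameter names are hypothetical.

```python
def replicate(stream, blocked_on_out0=(), blocked_on_out1=()):
    """Copy a stream of (trace_id, payload) items to two outputs.

    Each output drops the items whose trace ID is in its blocked set.
    """
    out0 = [item for item in stream if item[0] not in blocked_on_out0]
    out1 = [item for item in stream if item[0] not in blocked_on_out1]
    return out0, out1


stream = [(0x10, "etm"), (0x20, "stm"), (0x10, "etm")]
# Hypothetical routing: everything goes to one sink, and the other sink
# does not receive trace ID 0x20.
to_first_sink, to_second_sink = replicate(stream, blocked_on_out1={0x20})
print(to_first_sink)
print(to_second_sink)
```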

Replicator implementation details are found in either the Arm CoreSight System-on-Chip SoC-600 Technical Reference Manual or the Arm CoreSight SoC-400 Technical Reference Manual.

CoreSight SoC-600 implements ADIv6. CoreSight SoC-400 implements ADIv5. Consult your target designer if you are unsure which of these TRMs applies to your target.

Trace link: cross-trigger network

The cross-trigger network consists of Cross Trigger Interfaces (CTIs) and Cross Trigger Matrices (CTMs). CTIs enable the distribution of events to and from sources and destinations in the system. CTIs are connected to each other using one or more CTMs through channel interfaces.

CTIs are software-configurable, which allows the user to program the trigger-to-channel and channel-to-trigger mappings. When a trigger event occurs on a mapped channel, the event is broadcast on that channel to all other CTIs in the system.
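The mapping and broadcast behavior can be sketched as a small Python model. This is a conceptual model only and does not use real CTI register names. The class names, trigger names, and channel numbers are all hypothetical.

```python
class CrossTriggerMatrix:
    """Connects CTIs and broadcasts channel events between them."""

    def __init__(self):
        self.ctis = []

    def connect(self, cti):
        self.ctis.append(cti)
        cti.ctm = self

    def broadcast(self, channel, origin):
        # Deliver the channel event to every CTI except the one that raised it.
        for cti in self.ctis:
            if cti is not origin:
                cti.on_channel_event(channel)


class CrossTriggerInterface:
    def __init__(self, name):
        self.name = name
        self.ctm = None
        self.input_map = {}   # trigger input name -> set of channels
        self.output_map = {}  # channel -> set of trigger output names
        self.fired = []       # trigger outputs asserted so far

    def map_input(self, trigger_in, channel):
        self.input_map.setdefault(trigger_in, set()).add(channel)

    def map_output(self, channel, trigger_out):
        self.output_map.setdefault(channel, set()).add(trigger_out)

    def raise_trigger(self, trigger_in):
        for channel in sorted(self.input_map.get(trigger_in, ())):
            self.ctm.broadcast(channel, origin=self)

    def on_channel_event(self, channel):
        for trigger_out in sorted(self.output_map.get(channel, ())):
            self.fired.append(trigger_out)


ctm = CrossTriggerMatrix()
core_cti = CrossTriggerInterface("core")
etr_cti = CrossTriggerInterface("etr")
tpiu_cti = CrossTriggerInterface("tpiu")
for cti in (core_cti, etr_cti, tpiu_cti):
    ctm.connect(cti)

core_cti.map_input("etm_trigger", channel=0)      # trigger to channel
etr_cti.map_output(channel=0, trigger_out="etr_trigger_in")    # channel to trigger
tpiu_cti.map_output(channel=1, trigger_out="tpiu_trigger_in")  # different channel

core_cti.raise_trigger("etm_trigger")
print(etr_cti.fired, tpiu_cti.fired)
```

Only the CTI that maps channel 0 to a trigger output reacts, even though the event is broadcast to every other CTI.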

In the context of trace, CTIs communicate events between the different trace components and other CoreSight components. For example, the cross-trigger network allows triggers to be routed from trace sources like cores to trace sinks like an ETR.

CTI and CTM implementation details are found in either the Arm CoreSight System-on-Chip SoC-600 Technical Reference Manual or the Arm CoreSight SoC-400 Technical Reference Manual.

CoreSight SoC-600 implements ADIv6. CoreSight SoC-400 implements ADIv5. Consult your target designer if you are unsure which of these TRMs applies to your target.

Timestamp generator

The timestamp generator generates 64-bit rolling time for distribution to other CoreSight components, which allows later alignment of trace information. In the Arm implementation, the timestamp generator runs at a constant clock frequency, regardless of the power and clocking state of the processor that uses it. The timestamp generator has two APB interfaces: a read-only interface to read the counter value and management registers, and a programming interface. In many Cortex-A processor systems, the processors and the CoreSight infrastructure use the same source of time.
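When you align trace information using these timestamps, you often need the elapsed time between two counter values. The following Python sketch assumes that "rolling" means the 64-bit counter wraps to zero after its maximum value. The function names and the 50MHz counter frequency are assumptions for illustration. Use the frequency that your target documentation specifies.

```python
TIMESTAMP_MASK = (1 << 64) - 1  # 64-bit counter


def elapsed_ticks(start, end):
    """Ticks from `start` to `end`, allowing for one wrap of the counter."""
    return (end - start) & TIMESTAMP_MASK


def ticks_to_seconds(ticks, frequency_hz):
    """Convert ticks to seconds for a counter that runs at a constant frequency."""
    return ticks / frequency_hz


# Hypothetical values: the counter wraps between the two samples.
start = (1 << 64) - 10
end = 5
print(elapsed_ticks(start, end))                  # 15
print(ticks_to_seconds(50_000_000, 50_000_000))   # 1.0
```

The conversion to seconds relies on the constant clock frequency that is described above.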

Timestamp generator implementation details are found in either the Arm CoreSight System-on-Chip SoC-600 Technical Reference Manual or the Arm CoreSight SoC-400 Technical Reference Manual.

CoreSight SoC-600 implements ADIv6. CoreSight SoC-400 implements ADIv5. Consult your target designer if you are unsure which of these TRMs applies to your target.
