System architecture

So far in this guide, we have concentrated on the processor, but TrustZone is much more than just a set of processor features. To take advantage of the TrustZone features, we need support in the rest of the system as well.

Here is an example of a TrustZone-enabled system:

This section explores the key components in this system and their role in TrustZone.

Slave devices: peripherals and memories

Earlier in TrustZone in the processor we introduced the idea of two physical address spaces, Secure and Non-secure. The processor exports the address space that is being accessed to the memory system. The memory system uses this information to enforce the isolation.

In this topic, we refer to bus Secure and bus Non-secure. Bus Secure means a bus access to the Secure physical address space. Bus Non-secure means a bus access to the Non-secure physical address space. Remember that in Secure state software can access both physical address spaces. This means that the security of the bus access is not necessarily the same as the Security state of the processor that generated that access.

Note: In AMBA AXI and ACE, the AxPROT[1] signal is used to specify which address space is being accessed. As with the NS bit in the translation tables, 0 indicates Secure and 1 indicates Non-secure.
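
The relationship between the processor Security state, the NS bit, and AxPROT[1] can be sketched as follows. This is a minimal model, not hardware-accurate code: the function and constant names are hypothetical, and it assumes that an access from Non-secure state always targets the Non-secure physical address space, while an access from Secure state follows the NS bit of the translation table entry.

```python
from enum import Enum


class SecurityState(Enum):
    SECURE = "Secure"
    NON_SECURE = "Non-secure"


# AxPROT[1]: 0 = Secure, 1 = Non-secure (AxPROT is a 3-bit signal).
AXPROT_NONSECURE_BIT = 1 << 1


def bus_is_nonsecure(state: SecurityState, ns_bit: int) -> bool:
    """Return True if the access targets the Non-secure physical address space."""
    if state is SecurityState.NON_SECURE:
        # Assumption: Non-secure state can only generate Non-secure accesses.
        return True
    # In Secure state, the translation table NS bit selects the address space.
    return ns_bit == 1


def axprot_with_security(axprot: int, nonsecure: bool) -> int:
    """Set or clear AxPROT[1], leaving AxPROT[0] and AxPROT[2] unchanged."""
    if nonsecure:
        return (axprot | AXPROT_NONSECURE_BIT) & 0b111
    return axprot & ~AXPROT_NONSECURE_BIT & 0b111
```

The model shows why bus security and processor Security state are separate ideas: `bus_is_nonsecure(SecurityState.SECURE, 1)` is a bus Non-secure access generated by software running in Secure state.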

In theory, a system could have two entirely separate memory systems, using the accessed physical address space (AxPROT) to select between them. In practice this is unlikely. Instead, systems use the physical address space like an attribute, controlling access to different devices in the memory system.

In general, we can talk about two types of slave devices:

  • TrustZone aware
    This is a device that is built with some knowledge of TrustZone and uses the security of the access internally.
    An example is the Generic Interrupt Controller (GIC). The GIC is accessed by software in both Secure and Non-secure state. Non-secure accesses are only able to see Non-secure interrupts. Secure accesses can see all interrupts. The GIC uses the security of the bus transaction to determine which view to present.

  • Non-TrustZone aware
    This represents most slaves in a typical system. The device does not use the security of the bus access internally.
    An example is a simple peripheral like a timer, or an on-chip memory. Each would be either Secure or Non-secure, but not both.

Enforcing isolation

TrustZone is sometimes referred to as a slave-enforced protection system. The master signals the security of its access and the memory system decides whether to allow the access. How is the memory system-based checking done?

In most modern systems, the memory system-based checking is done by the interconnect. For example, the Arm NIC-400 allows system designers to specify for each connected slave:

  • Secure
    Only Secure accesses are passed to the device. The interconnect generates a fault for all Non-secure accesses, without the access being presented to the device.

  • Non-secure
    Only Non-secure accesses are passed to the device. The interconnect generates a fault for all Secure accesses, without the access being presented to the device.

  • Boot time configurable
    At boot time, system initialization software can program the device as Secure or Non-secure.
    The default is Secure.

  • TrustZone aware
    The interconnect allows all accesses through. The connected device must implement isolation.
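
The four settings above can be modeled in a few lines. This is a behavioral sketch only: `SlaveSecurity` and `SlavePort` are hypothetical names, and the code does not reflect the register interface of the NIC-400 or any other interconnect.

```python
from enum import Enum


class SlaveSecurity(Enum):
    SECURE = "secure"
    NON_SECURE = "non_secure"
    BOOT_CONFIGURABLE = "boot_configurable"
    TRUSTZONE_AWARE = "trustzone_aware"


class SlavePort:
    """Hypothetical model of one slave connection on the interconnect."""

    def __init__(self, setting: SlaveSecurity):
        self.setting = setting
        # Boot time configurable ports default to Secure.
        self.boot_secure = True

    def program(self, secure: bool) -> None:
        """Called by system initialization software at boot time."""
        if self.setting is not SlaveSecurity.BOOT_CONFIGURABLE:
            raise ValueError("port security was fixed at design time")
        self.boot_secure = secure

    def allows(self, access_nonsecure: bool) -> bool:
        """Return True if the access is passed on, False if it faults."""
        if self.setting is SlaveSecurity.TRUSTZONE_AWARE:
            # The connected device must implement the isolation itself.
            return True
        if self.setting is SlaveSecurity.BOOT_CONFIGURABLE:
            port_secure = self.boot_secure
        else:
            port_secure = self.setting is SlaveSecurity.SECURE
        # Secure ports accept only Secure accesses, and vice versa.
        return port_secure != access_nonsecure
```

A `False` result from `allows` stands for the interconnect generating a fault without presenting the access to the device.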

For example:

This approach works well for either TrustZone-aware devices or those devices that live entirely within one address space. For larger memories, like off-chip DDR, we might want to partition the memory into Secure and Non-secure regions. A TrustZone Address Space Controller (TZASC) allows us to do this, as you can see in the following diagram:

The TZASC is similar to a Memory Protection Unit (MPU), and allows the address space of a device to be split into several regions, with each region specified as Secure or Non-secure. The registers to control the TZASC are Secure access only, permitting only Secure software to partition memory.

An example of a TZASC is the Arm TZC-400, which supports up to nine regions.
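
The region check that a TZASC performs can be sketched as follows. This is a deliberately simplified model: the `Region` type, the example addresses, and the rule that an address outside every region is rejected are all assumptions made for illustration, and a real controller such as the TZC-400 offers finer-grained permissions than one Secure or Non-secure flag per region.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Region:
    base: int     # first address in the region
    limit: int    # last address in the region (inclusive)
    secure: bool  # True = Secure region, False = Non-secure region


def access_allowed(regions: Iterable[Region], address: int,
                   access_nonsecure: bool) -> bool:
    """Return True if the access matches the security of its region."""
    for region in regions:
        if region.base <= address <= region.limit:
            return region.secure != access_nonsecure
    # Assumption: addresses outside every region are rejected.
    return False


# Hypothetical partitioning of off-chip DDR: the top 16MB is Secure.
DDR_REGIONS = [
    Region(base=0x8000_0000, limit=0xFEFF_FFFF, secure=False),
    Region(base=0xFF00_0000, limit=0xFFFF_FFFF, secure=True),
]
```

With this layout, a bus Non-secure access to 0xFF00_1000 is rejected, while a bus Secure access to the same address is allowed.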

Note: Off-chip memory is less secure than on-chip memory, because it is easier for an attacker to read or modify its contents. On-chip memories are more secure but are much more expensive and of limited size. As always, we must balance cost, usability, and security. Be careful when deciding which assets you want in off-chip memories and which assets need to be kept on-chip.

Bus masters

Next, we will look at the bus masters in the system, as you can see in the following diagram:

The A-profile processors in the system are TrustZone aware and send the correct security status with each bus access. However, most modern SoCs also contain non-processor bus masters, for example, GPUs and DMA controllers.

As with slave devices, we can roughly divide the master devices in the system into groups:

  • TrustZone aware
    Some masters are TrustZone aware, and like the processor, provide the appropriate security information with each bus access. Examples of this include System MMUs (SMMUs) that are built to the Arm SMMUv3 specification.

  • Non-TrustZone aware
    Not all masters are built with TrustZone awareness, particularly when reusing legacy IP. Such masters typically provide no security information with their bus accesses, or always send the same value.

What system resources do non-TrustZone-aware masters need to access? Based on the answer to this question, we could pick one of several approaches:

  • Design time tie-off
    Where the master only needs to access a single physical address space, a system designer can fix the address spaces to which it has access, by tying off the appropriate signal. This solution is simple, but is not flexible.

  • Configurable logic
    Logic is provided to add the security information to the master’s bus accesses. Some interconnects, like the Arm NIC-400, provide registers that Secure software can use at boot time to set the security of an attached master’s accesses. This overrides whatever value the master provided itself. This approach still only allows the master to access a single physical address space but is more flexible than a tie-off.

  • SMMU
    A more flexible option is an SMMU. For a trusted master, the SMMU behaves like the MMU in Secure state. This includes the NS bit in the translation table entries, controlling which physical address space is accessed.
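
The difference between the three approaches can be sketched as follows. The function names are hypothetical, and the model makes two assumptions: a tie-off is treated as an override that is fixed at design time, and a master that is not trusted is always given bus Non-secure accesses by the SMMU.

```python
from typing import Optional


def resolve_nonsecure(master_value: bool,
                      override: Optional[bool] = None) -> bool:
    """Security driven onto the bus for a non-TrustZone-aware master.

    override models a tie-off (constant chosen at design time) or a
    configurable register (written by Secure software at boot time).
    When it is set, it replaces whatever the master provided.
    """
    return master_value if override is None else override


def smmu_nonsecure(trusted_master: bool, page_ns_bit: int) -> bool:
    """Security of an access that is translated by an SMMU."""
    if not trusted_master:
        # Assumption: untrusted masters only generate Non-secure accesses.
        return True
    # For a trusted master, the NS bit of each translation table entry
    # selects the physical address space, as in the MMU in Secure state.
    return page_ns_bit == 1
```

The sketch makes the trade-off visible: `resolve_nonsecure` returns one value for every access from the master, whereas `smmu_nonsecure` can return a different value for each page that a trusted master accesses.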

M and R profile Arm processors

Many modern designs include a mixture of A-profile, R-profile, and M-profile processors. For example, a mobile device might have an A-profile processor to run the mobile OS, an R-profile processor for the cellular modem, and an M-profile processor for low-level system control. The following diagram shows an example mobile device and the different processors that you might find:

The R profile does not support the two Security states in the way that the A profile does. This means that software running on those processors cannot control the outputted physical address space. In this way, they behave much like other non-TrustZone aware bus masters. The same is true for M profile processors that do not implement TrustZone for Armv8-M.

Often these processors only need to access a single physical address space. Using our example of a mobile device, the design typically includes an M-profile processor for low-level system control. This is sometimes called a System Control Processor (SCP). In many systems, the SCP would be a Secure-only device. This means that it only needs the ability to generate bus Secure accesses.

Interrupts

Next, we will look at the interrupts in the system, as you can see in the following diagram:

 

The Generic Interrupt Controller (GIC) supports TrustZone. Each interrupt source, called an INTID in the GIC specification, is assigned to one of three Groups:

  • Group 0: Secure interrupt, signaled as FIQ
  • Secure Group 1: Secure interrupt, signaled as IRQ or FIQ
  • Non-secure Group 1: Non-secure interrupt, signaled as IRQ or FIQ

This is controlled by software writing to the GIC[D|R]_IGROUPR<n> and GIC[D|R]_IGRPMODR<n> registers, which can only be done from Secure state. The allocation is not static. Software can update the allocations at run-time.
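
As a sketch of how the two registers combine, the following decodes the group of an INTID from its bit in IGROUPR<n> and its bit in IGRPMODR<n>. The encoding shown is our reading of the GICv3 architecture with two Security states enabled, and the function names are hypothetical. Check the encoding against the GIC specification for your implementation before relying on it.

```python
from enum import Enum


class Group(Enum):
    GROUP0 = "Group 0"
    SECURE_GROUP1 = "Secure Group 1"
    NON_SECURE_GROUP1 = "Non-secure Group 1"


def register_and_bit(intid: int) -> tuple:
    """Index n of GICD_IGROUPR<n> and the bit position for an INTID.

    Each 32-bit register holds one bit for each of 32 INTIDs.
    """
    return intid // 32, intid % 32


def decode_group(igroupr_bit: int, igrpmodr_bit: int) -> Group:
    """Combine the IGROUPR and IGRPMODR bits for one INTID."""
    if igroupr_bit == 1:
        # Assumed encoding: IGROUPR=1 gives Non-secure Group 1. The
        # IGRPMODR=1, IGROUPR=1 combination is reserved, and is treated
        # here as Non-secure Group 1.
        return Group.NON_SECURE_GROUP1
    if igrpmodr_bit == 1:
        return Group.SECURE_GROUP1
    return Group.GROUP0
```

For example, `register_and_bit(42)` gives register 1, bit 10, and both bits at that position clear means that INTID 42 is in Group 0.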

For INTIDs that are configured as Secure, only bus Secure accesses can modify state and configuration. Register fields corresponding to Secure interrupts are read as 0s to Non-secure bus accesses.

For INTIDs that are configured as Non-secure, both Secure and Non-secure bus accesses can modify state and configuration.

Why are there two Secure Groups? Typically, Group 0 is used for interrupts that are handled by the EL3 firmware. These relate to low-level system management functions. Secure Group 1 is used for all the other Secure interrupt sources and is typically handled by the S.EL1 or S.EL2 software.

Handling interrupts

The processor has two interrupt exceptions, IRQ and FIQ. When an interrupt becomes pending, the GIC uses different interrupt signals depending on the group of the interrupt and the current Security state of the processor:

  • Group 0 interrupt
    • Always signaled as FIQ exception

  • Secure Group 1
    • Processor currently in Secure state – IRQ exception
    • Processor currently in Non-secure state – FIQ exception

  • Non-secure Group 1
    • Processor currently in Secure state – FIQ exception
    • Processor currently in Non-secure state – IRQ exception

Remember that Group 0 interrupts are typically used for the EL3 firmware. This means that:

  • IRQ means a Group 1 interrupt for the current Security state.
  • FIQ means that we need to enter EL3, either to switch Security state or to have the firmware handle the interrupt.
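
The signaling rules above reduce to a small function. This is a minimal sketch with hypothetical names that restates the list, and it does not model the exception routing controls.

```python
from enum import Enum


class Group(Enum):
    GROUP0 = "Group 0"
    SECURE_GROUP1 = "Secure Group 1"
    NON_SECURE_GROUP1 = "Non-secure Group 1"


def signaled_exception(group: Group, processor_secure: bool) -> str:
    """Return "IRQ" or "FIQ" for a pending interrupt."""
    if group is Group.GROUP0:
        # Group 0 is always signaled as FIQ.
        return "FIQ"
    interrupt_secure = group is Group.SECURE_GROUP1
    # A Group 1 interrupt for the current Security state is an IRQ.
    # A Group 1 interrupt for the other Security state is an FIQ.
    return "IRQ" if interrupt_secure == processor_secure else "FIQ"
```

Writing the rule this way shows the symmetry: for Group 1 interrupts, the signal depends only on whether the security of the interrupt matches the current Security state of the processor.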

The following example shows how the exception routing controls could be configured:

The preceding diagram shows one possible configuration. Another option that is commonly seen is for FIQs to be routed to EL1 while in Secure state. The Trusted OS treats the FIQ as a request to yield to either the firmware or to Non-secure state. This approach to routing interrupts gives the Trusted OS the opportunity to be exited in a controlled manner.

Debug, trace and profiling

Next, we will look at the debug and trace components in the system, as you can see in the following diagram:

Modern Arm systems include extensive features to support debugging and profiling. With TrustZone, we must ensure that these features cannot be used to compromise the security of the system.

Regarding debug features, consider the development of a new SoC. Different developers are trusted to debug different parts of the system. The chip company engineers need, and are trusted to, debug all parts, including the Secure state code. Therefore, all the debug features should be enabled.

When the chip ships to an OEM, they still need to debug the Non-secure state software stack. However, the OEM might be prevented from debugging the Secure state code.

In the shipping product containing the chip, we might want some debug features for application developers. But we also want to limit the ability to debug the code of the silicon provider and the OEM.

Signals to enable the different debug, trace, and profiling features help us deal with this situation. This includes separate signals to control use of the features in Secure state and Non-secure state.

Continuing with the debug example, these signals include:

  • DBGEN – Top-level invasive debug enable, controls external debug in both Security states
  • SPIDEN – Secure Invasive Debug Enable, controls external ability to debug in Secure state

Note: These two signals are examples. There are other debug authentication signals. Refer to the Technical Reference Manual of your processor for a complete list.

Here is an example of how we might use these signals:

  • Early development by chip designer
    • DBGEN==1 and SPIDEN==1, enabling full external debug

  • Product development by OEM
    • DBGEN==1, enabling external debug in Non-secure state
    • SPIDEN==0, disabling debug in Secure state

  • Shipping product
    • DBGEN==0 and SPIDEN==0, disabling external debug in both Security states
    • Debug of applications still possible
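
The effect of the two signals at each stage can be sketched as follows. The function and dictionary names are hypothetical, and the model covers only DBGEN and SPIDEN, not the other debug authentication signals.

```python
def external_debug_allowed(dbgen: int, spiden: int,
                           secure_state: bool) -> bool:
    """Return True if external invasive debug is permitted."""
    if secure_state:
        # Debug in Secure state needs both signals to be asserted.
        return bool(dbgen and spiden)
    # Debug in Non-secure state needs only the top-level enable.
    return bool(dbgen)


# Signal values for the three stages listed above.
LIFECYCLE = {
    "chip_designer": {"dbgen": 1, "spiden": 1},
    "oem": {"dbgen": 1, "spiden": 0},
    "shipping": {"dbgen": 0, "spiden": 0},
}
```

For the OEM stage, `external_debug_allowed(**LIFECYCLE["oem"], secure_state=True)` is `False`, while the same call with `secure_state=False` is `True`.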

Because we want different signal values at different stages of development, it is common to connect the signals using e-fuses or authentication blocks. Here is an example:

By blowing fuses during manufacture, external debug can be permanently disabled. Using fuses does make in-field debug more difficult. When the fuses are blown, they cannot be unblown. An authentication module is more flexible.

Other devices

Finally, we will look at the other devices in the system, as you can see in the following diagram:

 

Our example TrustZone-enabled system includes several devices which we have not yet covered, but which we need to build a practical system.

  • One-time programmable memory (OTP) or fuses
    These are memories that cannot be changed once they are written. Unlike a boot ROM which contains the same image on each chip, the OTP can be programmed with device unique values and possibly OEM unique values.
    One of the things that is stored in OTP is a device unique private key. When each chip is manufactured, a randomly generated unique key is written to the OTP. This device unique private key is used to tie data to the chip.
    The advantage of a device unique private key is that it prevents class attacks. If every chip had the same key, compromising one device would make all similar devices vulnerable.
    OTP is also often used to store hashes of OEM public keys. OTP is relatively expensive compared to other memories. For public keys, only storing the hash and not storing the full key saves cost.

  • Non-volatile counter
    A non-volatile (NV) counter might be implemented using more fuses. This is a counter that can only increase and can never be reset.
    NV counters are used to protect against rollback attacks. Imagine that there is a known vulnerability in version 3 of the firmware on a device. The device is currently running version 4, on which the vulnerability is fixed. An attacker might try to downgrade the firmware back to version 3, to exploit the known vulnerability. To protect against this, each time the firmware is updated the count is increased. At boot, the version of the firmware is checked against the NV counter. If there is a mismatch, the device knows that it is being attacked.

  • Trusted RAM and Trusted ROM
    These are on-chip Secure access only memories.
    The Trusted ROM is where the first boot code is fetched from. Being on-chip means that an attacker cannot replace it. Being a ROM means that an attacker cannot reprogram it. This means that we have a known, trusted, starting point of execution. This starting point is discussed in the Software architecture section of this guide.
    The Trusted RAM is typically an SRAM of a couple of hundred kilobytes. This is the working memory of the software that is running in Secure state. Again, being on-chip makes it difficult for an attacker to gain access to its content.
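
Two of the checks described in this list, comparing an OEM public key with the hash stored in OTP and comparing the firmware version with the NV counter, can be sketched as follows. All names are hypothetical. The use of SHA-256 and the policy of accepting a newer version and advancing the counter are assumptions made for illustration, not requirements stated in this guide.

```python
import hashlib
import hmac


class RollbackError(Exception):
    """Raised when the firmware is older than the NV counter allows."""


def public_key_matches_otp(public_key: bytes, otp_hash: bytes) -> bool:
    """Check a full public key against the hash that is held in OTP.

    Assumption: the hash is SHA-256. Only the hash needs to be in OTP.
    The full key can be stored in cheaper memory with the firmware.
    """
    digest = hashlib.sha256(public_key).digest()
    return hmac.compare_digest(digest, otp_hash)


def check_firmware_version(firmware_version: int, nv_counter: int) -> int:
    """Return the NV counter value to hold after this boot.

    Assumed policy: a version lower than the counter is a rollback and
    is rejected. A higher version is treated as an update, so the
    counter advances to match it.
    """
    if firmware_version < nv_counter:
        raise RollbackError(
            f"firmware version {firmware_version} is below "
            f"NV counter {nv_counter}"
        )
    return firmware_version
```

Using the example above, a device whose NV counter holds 4 rejects an image that reports version 3, because `check_firmware_version(3, 4)` raises `RollbackError`.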

Trusted Base System Architecture

The Trusted Base System Architecture (TBSA) is a set of guidelines from Arm for system designers. TBSA provides recommendations on what resources different use cases require, for example, how many bits of OTP are required.
