Address spaces in AArch64
There are several independent virtual address spaces in Armv8-A. This diagram shows these virtual address spaces:
The diagram shows three virtual address spaces:
- NS.EL0 and NS.EL1 (Non-secure EL0/EL1).
- NS.EL2 (Non-secure EL2).
- EL3.
Each of these virtual address spaces is independent, and has its own settings and tables. We often call these settings and tables 'translation regimes'. There are also virtual address spaces for Secure EL0, Secure EL1 and Secure EL2, but they are not shown in the diagram.
Note: Support for Secure EL2 was added in Armv8.4-A.
Because there are multiple virtual address spaces, it is important to specify which address space an address is in. For example, NS.EL2:0x8000 refers to the address 0x8000 in the Non-secure EL2 virtual address space.
The diagram also shows that the virtual addresses from Non-secure EL0 and Non-secure EL1 go through two sets of tables. These tables support virtualization and allow the hypervisor to virtualize the view of physical memory that is seen by a virtual machine (VM).
In virtualization, we call the set of translations that are controlled by the OS Stage 1. The Stage 1 tables translate virtual addresses to intermediate physical addresses (IPAs). In Stage 1, the OS thinks that the IPAs are physical addresses. However, the hypervisor controls a second set of translations, which we call Stage 2. This second set of translations translates IPAs to physical addresses. This diagram shows how the two sets of translations work:
Although there are some minor differences in the table format, the process of Stage 1 and Stage 2 translation is usually the same.
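The two-stage flow can be sketched in a few lines of Python. This is a toy model, not how the hardware stores its tables: each stage is a flat dictionary from page number to page number, whereas real translation tables are multi-level structures in memory. The 4KB page size, the function names, and the example mappings are all assumptions made for illustration.

```python
PAGE_SHIFT = 12                       # assumption: 4KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(table, addr):
    """Translate addr through one stage; raise if the page is unmapped."""
    page = addr >> PAGE_SHIFT
    if page not in table:
        raise LookupError(f"translation fault at {addr:#x}")
    # Replace the page number, keep the offset within the page.
    return (table[page] << PAGE_SHIFT) | (addr & PAGE_MASK)


def va_to_pa(stage1, stage2, va):
    """VA -> IPA using the OS tables, then IPA -> PA using the hypervisor tables."""
    ipa = translate(stage1, va)       # Stage 1: controlled by the guest OS
    return translate(stage2, ipa)     # Stage 2: controlled by the hypervisor


# Hypothetical mappings: the guest OS maps VA page 0x8 to IPA page 0x40,
# and the hypervisor maps IPA page 0x40 to PA page 0x80040.
stage1 = {0x8: 0x40}
stage2 = {0x40: 0x80040}
pa = va_to_pa(stage1, stage2, 0x8123)
```

The guest OS only ever sees the result of the first lookup, which is why it treats the IPA as if it were a physical address.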
Note: At Arm, we use the address 0x8000 in many of our examples. 0x8000 is also the default address for linking with the Arm linker, armlink. The address comes from an early microcomputer, the BBC Micro Model B, which had ROM (and sideways RAM) at the address 0x8000. The BBC Micro Model B was built by a company called Acorn, which developed the Acorn RISC Machine (ARM), and later became Arm.
Armv8-A is a 64-bit architecture, but this does not mean that all addresses are 64-bit.
Size of virtual addresses
Virtual addresses are stored in a 64-bit format. As a result, the address in load instructions (LDR) and store instructions (STR) is always specified in an X register. However, not every 64-bit value that an X register can hold is a valid address.
This diagram shows the layout of the virtual address space in AArch64:
There are two regions for the EL0/EL1 virtual address space: kernel space and application space. These two regions are shown on the left-hand side of the diagram, with kernel space at the top, and application space, which is labelled 'User space', at the bottom of the address space. Kernel space and user space have separate translation tables and this means that their mappings can be kept separate.
There is a single region at the bottom of the address space for all other Exception levels. This region is shown on the right-hand side of the diagram and is the box with no text in it.
Note: Setting HCR_EL2.E2H to 1 enables a configuration where a host OS runs in EL2, and the applications of the host OS run in EL0. In this scenario, EL2 also has an upper and a lower region.
Each region of address space has a size of up to 2^52 bytes. However, each region can be independently shrunk to a smaller size. The TnSZ fields in the TCR_ELx registers control the size of the virtual address space. For example, this diagram shows that TCR_EL1 controls the EL0/EL1 virtual address space:
The virtual address size is encoded as:
Virtual address size in bytes = 2^(64 - TCR_ELx.TnSZ)
The virtual address size can also be expressed as a number of address bits:
Number of address bits = 64 - TnSZ
Therefore, if TCR_EL1.T1SZ is set to 32, the size of the kernel region in the EL0/EL1 virtual address space is 2^32 bytes (0xFFFF_FFFF_0000_0000 to 0xFFFF_FFFF_FFFF_FFFF). Any address that is outside of the configured range or ranges will, when it is accessed, generate an exception as a translation fault. The advantage of this configuration is that we only need to describe as much of the address space as we want to use, which saves time and space. For example, imagine that the OS kernel needs 1GB of address space (30-bit address size) for its kernel space. If the OS sets T1SZ to 34, then only the translation table entries to describe 1GB are created, as 64 - 34 = 30.
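The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the model only covers the size calculation described here: the lower region starts at address 0, the upper region ends at the top of the 64-bit range, and anything in between is treated as a fault. It ignores any other architectural features that affect address checking.

```python
TOP_OF_ADDRESS_SPACE = (1 << 64) - 1


def region_size_bytes(tnsz):
    """Size of a region in bytes: 2^(64 - TnSZ)."""
    return 1 << (64 - tnsz)


def lower_region(t0sz):
    """Inclusive (start, end) of the region sized by T0SZ."""
    return 0, region_size_bytes(t0sz) - 1


def upper_region(t1sz):
    """Inclusive (start, end) of the region sized by T1SZ."""
    start = TOP_OF_ADDRESS_SPACE - region_size_bytes(t1sz) + 1
    return start, TOP_OF_ADDRESS_SPACE


def classify(va, t0sz, t1sz):
    """Return 'user', 'kernel', or 'fault' for a 64-bit virtual address."""
    lo_start, lo_end = lower_region(t0sz)
    hi_start, hi_end = upper_region(t1sz)
    if lo_start <= va <= lo_end:
        return "user"
    if hi_start <= va <= hi_end:
        return "kernel"
    return "fault"            # outside both configured ranges
```

With T1SZ set to 32, `upper_region(32)` gives the range quoted in the text, and `region_size_bytes(34)` gives the 1GB kernel space from the second example.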
Note: All Armv8-A implementations support 48-bit virtual addresses. Support for 52-bit virtual addresses is optional and reported by ID_AA64MMFR2_EL1. At the time of writing, none of the Arm Cortex-A processors support 52-bit virtual addresses.
Size of physical addresses
The size of a physical address is IMPLEMENTATION DEFINED, up to a maximum of 52 bits. The ID_AA64MMFR0_EL1 register reports the size that is implemented by the processor. For Arm Cortex-A processors, this will usually be 40 bits or 44 bits.
Note: In Armv8.0-A, the maximum size for a physical address is 48 bits. This was extended to 52 bits in Armv8.2-A.
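As an illustration of how software might interpret ID_AA64MMFR0_EL1, the following Python sketch decodes the PARange field, which occupies bits [3:0] of the register. The encodings are taken from the Arm Architecture Reference Manual and only cover sizes up to the 52-bit maximum discussed here. The function name is hypothetical, and the register value is passed in as a plain integer because reading the register itself requires privileged code running on the processor.

```python
# PARange encodings (ID_AA64MMFR0_EL1 bits [3:0]) mapped to physical address bits.
PARANGE_TO_BITS = {
    0b0000: 32,
    0b0001: 36,
    0b0010: 40,
    0b0011: 42,
    0b0100: 44,
    0b0101: 48,
    0b0110: 52,   # requires the Armv8.2-A extension to 52 bits
}


def physical_address_bits(id_aa64mmfr0_el1):
    """Return the implemented physical address size in bits."""
    parange = id_aa64mmfr0_el1 & 0xF
    if parange not in PARANGE_TO_BITS:
        # Encodings not listed above are treated as unknown in this sketch.
        raise ValueError(f"unrecognized PARange encoding {parange:#06b}")
    return PARANGE_TO_BITS[parange]
```

For example, a register value whose low four bits are 0b0010 reports a 40-bit physical address size.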
Size of intermediate physical addresses
If you specify an output address in a translation table entry that is larger than the implemented maximum, the Memory Management Unit (MMU) will generate an exception as an address size fault.
The size of the IPA space can be configured in the same way as the virtual address space. VTCR_EL2.T0SZ controls the size. The maximum size that can be configured is the same as the physical address size that is supported by the processor. This means that you cannot configure a larger IPA space than the supported physical address space.
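The two checks in this section can be sketched as follows. This is a simplified model with hypothetical function names: it assumes that the IPA size follows the same 64 - T0SZ rule as the virtual address space, as described above, and that the physical address size in bits has already been read from ID_AA64MMFR0_EL1.

```python
def ipa_bits(vtcr_t0sz):
    """Number of IPA bits selected by VTCR_EL2.T0SZ."""
    return 64 - vtcr_t0sz


def ipa_config_is_valid(vtcr_t0sz, pa_bits):
    """The IPA space must not be larger than the physical address space."""
    return ipa_bits(vtcr_t0sz) <= pa_bits


def is_address_size_fault(output_address, pa_bits):
    """True if an output address has bits set above the implemented size."""
    return (output_address >> pa_bits) != 0
```

On a processor with 40-bit physical addresses, a T0SZ value of 24 selects a 40-bit IPA space, which is the largest that this model accepts.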
Address Space Identifiers - Tagging translations with the owning process
Many modern OSs have applications that all seem to run from the same address region. This region is what we have described as user space. In practice, different applications require different mappings. This means, for example, that the translation for VA 0x8000 depends on which application is currently running.
Ideally, we would like the translations for different applications to coexist within the Translation Lookaside Buffers (TLBs), to prevent the need for TLB invalidates on a context switch. But how would the processor know which version of the VA 0x8000 translation to use? In Armv8-A, the answer is Address Space Identifiers (ASIDs).
For the EL0/EL1 virtual address space, translations can be marked as Global (G) or Non-Global (nG) using the nG bit in the attributes field of the translation table entry. For example, kernel mappings are Global translations, and application mappings are Non-Global translations. Global translations apply whichever application is currently running. Non-Global translations apply only to a specific application.
Non-Global mappings are tagged with an ASID in the TLBs. On a TLB lookup, the ASID in the TLB entry is compared with the currently selected ASID. If they do not match, then the TLB entry is not used. This diagram shows a Global mapping in the kernel space with no ASID tag and a non-Global mapping in user space with an ASID tag:
The diagram shows that TLB entries for multiple applications are allowed to coexist in the TLB, and the ASID determines which entry to use.
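The lookup rule can be modeled in Python. This is only a sketch of the matching behavior described above, not of how a real TLB is organized: the TLB is a plain list, entries map page numbers rather than full addresses, and a Global entry is represented by an ASID of None. All names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TlbEntry:
    va_page: int
    pa_page: int
    asid: Optional[int]      # None models a Global entry with no ASID tag


def tlb_lookup(tlb, va_page, current_asid):
    """Return the PA page for va_page, or None on a TLB miss."""
    for entry in tlb:
        if entry.va_page != va_page:
            continue
        # Global entries always match; Non-Global entries need the same ASID.
        if entry.asid is None or entry.asid == current_asid:
            return entry.pa_page
    return None              # miss: hardware would walk the translation tables


tlb = [
    TlbEntry(va_page=0x8, pa_page=0x100, asid=1),         # application A
    TlbEntry(va_page=0x8, pa_page=0x200, asid=2),         # application B
    TlbEntry(va_page=0xFFFF0, pa_page=0x300, asid=None),  # kernel, Global
]
```

Both applications have an entry for the same virtual page, and the currently selected ASID decides which one is returned. The kernel entry is returned regardless of the ASID.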
The ASID is stored in one of the two TTBRn_EL1 registers. Usually TTBR0_EL1 is used for user space. As a result, a single register update can change both the ASID and the translation table that it points to.
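To show why a single register write is enough, the following Python sketch packs an ASID and a table base address into one 64-bit TTBR value. It places the ASID in bits [63:48], which is where the architecture defines the field. The sketch assumes 16-bit ASIDs, a table base below 2^48, and 4KB alignment of the table. The function names are hypothetical, and real code would write the result to the register with an MSR instruction.

```python
ASID_SHIFT = 48
ASID_MASK = 0xFFFF                    # assumption: 16-bit ASIDs are implemented
BASE_LIMIT = 1 << 48                  # assumption: table base fits in 48 bits


def make_ttbr(table_base, asid):
    """Combine a translation table base address and an ASID into one value."""
    if not 0 <= asid <= ASID_MASK:
        raise ValueError("ASID does not fit in 16 bits")
    if not 0 <= table_base < BASE_LIMIT or table_base & 0xFFF:
        raise ValueError("table base must be 4KB aligned and below 2^48")
    return (asid << ASID_SHIFT) | table_base


def ttbr_asid(ttbr):
    """Extract the ASID field from a TTBR value."""
    return (ttbr >> ASID_SHIFT) & ASID_MASK
```

On a context switch, the OS would compute one such value per application and write it to TTBR0_EL1, which changes the tables and the ASID together.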
Note: ASID tagging is also available in EL2, when HCR_EL2.E2H==1.
Virtual Machine Identifiers - Tagging translations with the owning VM
EL0/EL1 translations can also be tagged with a Virtual Machine Identifier (VMID). VMIDs allow translations from different VMs to coexist in the cache. This is similar to the way in which ASIDs work for translations from different applications. In practice, this means that some translations will be tagged with both a VMID and an ASID, and that both must match for the TLB entry to be used.
Note: When virtualization is supported for a security state, EL0/EL1 translations are always tagged with a VMID – even if Stage 2 translation is not enabled. This means that if you are writing initialization code and are not using a hypervisor, it is important to set a known VMID value before setting up the Stage 1 MMU.
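The combined matching rule can be written as a small predicate. This is a sketch with hypothetical names: as in the ASID case, an entry ASID of None stands for a Global translation, which still has to belong to the current VM.

```python
def entry_matches(entry_vmid, entry_asid, current_vmid, current_asid):
    """Return True if a TLB entry can be used by the current VM and application."""
    if entry_vmid != current_vmid:
        return False                  # the entry belongs to a different VM
    # Within the VM, Global entries match any ASID.
    return entry_asid is None or entry_asid == current_asid
```

Two VMs can therefore both use ASID 5 for unrelated applications without their TLB entries being confused, because the VMID check fails first.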
Common not Private
If a system includes multiple processors, do the ASIDs and VMIDs used on one processor have the same meaning on other processors?
For Armv8.0-A the answer is that they do not have to mean the same thing. There is no requirement for software to use a given ASID in the same way across multiple processors. For example, ASID 5 might be used by the calculator on one processor and by the web browser on another processor. This means that a TLB entry that is created by one processor cannot be used by another processor.
In practice, it is unlikely that software will use ASIDs differently across processors. It is more common for software to use ASIDs and VMIDs in the same way on all processors in a given system. Therefore, Armv8.2-A introduced the Common not Private (CnP) bit in the Translation Table Base Register (TTBR). When the CnP bit is set, the software promises to use the ASIDs and VMIDs in the same way on all processors, which allows the TLB entries that are created by one processor to be used by another.
Note: We have been talking about processors. Technically, however, we should be using the term Processing Element (PE). PE is a generic term for any machine that implements the Arm architecture. The distinction is important here because there are microarchitectural reasons why sharing TLBs between processors would be difficult. But within a multithreaded processor, where each hardware thread is a PE, it is much more desirable to share TLB entries.