Navigating the Cortex Maze

By Chris Shore

Introduction

The ARM architecture is 28 years old this year. The ARM1 first ran code on 26th April 1985 at the office of Acorn Computers in Cambridge, UK.

Since the formation of ARM (then called “Advanced RISC Machines”) in 1990, the architecture has gone on to become the most popular and widely used in the world, with over 10B devices shipped every year. But the architecture hasn’t stood still! On the contrary, it has been through several incremental changes to get where it is today. The most significant change occurred in 2004 with the release of the ARMv7 architecture. This defined three architecture profiles:

ARMv7-A – “Application”
ARMv7-R – “Real-Time”
ARMv7-M – “Microcontroller”

This was a result of recognising that a single architecture was not capable of addressing the incredibly diverse needs of the electronics and computing industry. The processing requirements of a smartphone or tablet are very different from those of a smart electricity meter, for instance.

ARM’s current range of mainstream processors is marketed under the “Cortex” brand, with the letter after the name indicating the architecture profile. So, for example, Cortex-M3 is a processor which supports the ARMv7-M architecture and is targeted at microcontroller applications. Regardless of the target application and performance point, all ARM processors are 32-bit devices with a full 32-bit ALU and 32-bit register set.

In this short article, I want to look at the key features and target markets of each profile to give you a headstart when trying to select a processor for a new design.

ARMv7-A Application cores (Cortex-A)

The ARMv7-A architecture profile represents the pinnacle of the architecture in terms of performance and capability. ARMv7-A cores, such as Cortex-A15, are targeted at high-performance systems running some kind of platform OS, like Linux or Android. Internally, they have very complex microarchitectures making use of features like out-of-order execution, superscalar pipelines, branch prediction, register renaming etc.

These processors have the following features in common:

Virtual memory
All Cortex-A cores contain a Memory Management Unit (MMU) with full virtual-physical address translation capability. Later cores, such as Cortex-A15 and Cortex-A7, also include an extension which allows for 40-bit physical addressing. The memory architecture they support is termed the Virtual Memory System Architecture for ARMv7-A (VMSAv7).
Multi-level cache support
The architecture allows support for up to eight levels of cache, though it is not common to see more than three on even high-end systems.
Multicore capability
With the exception of Cortex-A8, all Cortex-A cores include support for multicore configurations. Typically, they can be implemented in clusters of up to four cores with hardware data coherency at L1. Systems can be extended beyond a single cluster using coherency support in the external memory system.
TrustZone security
TrustZone is a standard extension to all Cortex-A processors. It creates two virtual machines, running on a single processor, with carefully controlled partitioning between the two. This allows for implementation of highly secure systems for applications such as DRM or e-payment.
Virtualization (Cortex-A15 and Cortex-A7 only)
Later Cortex-A cores include a set of extensions which provide support for a hardware virtualization solution. Key system registers and components are virtualized in hardware allowing a hypervisor to create multiple virtual machines and host multiple guest operating systems.
NEON (Advanced SIMD Extension)
NEON is an optional extension to ARMv7-A which provides an instruction set and register bank for high-performance SIMD multimedia programming. NEON provides acceleration for key algorithms in data compression, transcoding, image processing etc. The NEON architecture also provides support for single-precision floating point.
Floating Point
An FPU is an optional extension on these cores, providing support for double-precision floating point.
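
To make the virtual memory and 40-bit physical addressing points above a little more concrete, here is a toy Python model of address translation. It is only a sketch under stated assumptions: it assumes 4 KiB pages and a flat, single-level lookup table, whereas a real VMSAv7 MMU walks multi-level translation tables held in memory. The names `translate` and `toy_table`, and the mapping values, are invented purely for illustration.

```python
PAGE_SHIFT = 12                      # assumption: 4 KiB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1    # low 12 bits are the offset within a page


def translate(page_table, virtual_address):
    """Map a 32-bit virtual address to a physical address (toy model)."""
    if not 0 <= virtual_address < (1 << 32):
        raise ValueError("virtual addresses are 32 bits wide")
    page = virtual_address >> PAGE_SHIFT    # virtual page number
    offset = virtual_address & PAGE_MASK    # carried across unchanged
    frame = page_table[page]                # KeyError stands in for a translation fault
    return (frame << PAGE_SHIFT) | offset


# Hypothetical mapping: virtual page 0x00010 -> physical frame 0x1234567.
# The resulting physical address does not fit in 32 bits, which is
# the situation the 40-bit physical addressing extension allows for.
toy_table = {0x00010: 0x1234567}

pa = translate(toy_table, 0x00010ABC)
print(hex(pa))           # 0x1234567abc
print(pa >= (1 << 32))   # True: beyond the 4 GiB reachable with 32 bits
print(pa < (1 << 40))    # True: within a 40-bit (1 TiB) physical space
```

The point of the sketch is simply that the virtual address seen by software stays 32 bits wide, while the physical address produced by translation can be wider.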

ARMv7-R Real-Time cores (Cortex-R)

The ARMv7-R architecture is similar in many ways to ARMv7-A. The programmer’s model is largely the same. However, key differences make the Cortex-R cores more suitable for high-performance real-time applications. Examples include engine management systems and hard disk drive controllers. Because they are typically used in highly specialised areas, Cortex-R based devices are generally custom-built for specific applications.

The following are key features of Cortex-R processors:

Hardware divide
Traditionally, ARM processors have not supported division in hardware, and software run-time library routines have been required instead. To provide high-performance data processing in demanding applications, Cortex-R cores support hardware division instructions.
Memory protection
Unlike Cortex-A, Cortex-R processors do not have an MMU and therefore do not support virtual address translation. Instead, they incorporate a Memory Protection Unit (MPU) which allows memory to be partitioned into secure regions with specific access control attributes.
NMI support
To support safety-critical systems, Cortex-R processors have optional support for a Non-Maskable Fast Interrupt.
Tightly Coupled Memory
As well as caches, Cortex-R processors support dedicated interfaces which can be connected to fast on-chip SRAM. This is called Tightly Coupled Memory (TCM). It provides configurable regions of memory, close to the processor, which can be accessed extremely quickly and efficiently, greatly enhancing real-time performance. External DMA access to TCM is also supported.
Deterministic interrupt behaviour
Optionally, the behaviour of some instructions can be modified to improve the speed and predictability of interrupt latency for application areas where this is important. This is referred to as Low Latency Interrupt Mode (LLIM).
Safety and fault-tolerance features
The L1 memory system and buses incorporate ECC and parity error detection/correction, a feature which is required for many safety-critical applications. Cortex-R5 and Cortex-R7 can also be implemented in a Dual-Core Lock Step (DCLS) configuration, providing hardware redundancy.
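
The memory protection idea described above, with no address translation and only per-region access checks, can be sketched as a small Python model. This is an illustration rather than a description of any particular core: the `Region` fields, the example addresses and the two-permission scheme are all assumptions, and a real MPU has more attributes (execute permission, privilege level, memory type and so on). The model also assumes that where regions overlap the higher-numbered region takes priority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One MPU region in this toy model (fields are illustrative)."""
    base: int
    size: int
    readable: bool
    writable: bool


def access_allowed(regions, address, write):
    """Return True if the access is permitted by the matching region."""
    # Search from the highest-numbered region down, so that a later
    # region overrides an earlier one where the two overlap.
    for region in reversed(regions):
        if region.base <= address < region.base + region.size:
            return region.writable if write else region.readable
    return False  # no region matches: treated as a fault in this model


# Hypothetical memory map: read-only code, read/write RAM, and a
# read-only guard area carved out of the top of the RAM.
regions = [
    Region(base=0x00000000, size=0x10000, readable=True, writable=False),
    Region(base=0x20000000, size=0x8000, readable=True, writable=True),
    Region(base=0x20007000, size=0x1000, readable=True, writable=False),
]

print(access_allowed(regions, 0x20000010, write=True))   # True
print(access_allowed(regions, 0x20007010, write=True))   # False (guard region wins)
print(access_allowed(regions, 0x00000100, write=False))  # True
```

Note that the address passed in is used directly: there is no mapping step, which is the essential difference from the MMU on Cortex-A.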

ARMv7-M Microcontroller cores (Cortex-M)

The microcontroller architecture, ARMv7-M, is significantly different from the others in several important ways. The motivation is to allow implementation of small, cost-effective, power-efficient devices. Although the emphasis is often on cost and size, Cortex-M devices are also capable of providing very high processing performance when required. From the “smallest” in the range, the Cortex-M0, to the top-of-the-range Cortex-M4 (which has optional floating point support), they span a huge range of performance points.

They have the following features in common.

Energy efficiency
All the Cortex-M cores support a range of power-efficient architectural sleep and standby states. When coupled with multiple power domains, this allows very energy-efficient devices to be designed.
High-density instruction set
Cortex-M processors only support the variable-length Thumb-2 instruction set. This allows for very dense and efficient code, while retaining full 32-bit processing capability.
High standardization
The Cortex-M architecture specifies a fixed memory map and a small set of standard peripherals, including a vectored interrupt controller and a system timer. This encourages a high degree of standardization across vendors, tools and operating systems, building a strong ecosystem around standard parts from multiple sources.
Simple programmer’s model
In the smallest configuration, Cortex-M processors support two operating modes (with no concept of privilege), a single stack and a single, simple register set. There is optional support on the higher-end processors for privileged operation and for separate process and exception stacks. This supports everything from the simplest bare-metal application to more demanding requirements which need a real-time operating system.
Optional memory protection
All except the Cortex-M0 offer optional Memory Protection Units which allow the memory map to be partitioned into regions which have configurable access protection attributes. This allows for clear and secure separation, for instance, between operating system and user application code.
Optional floating point support
The Cortex-M4 comes with an optional Floating Point Unit, which supports IEEE-754 single-precision floating point in hardware.
Simple debug solutions
The debug architecture of Cortex-M processors is highly configurable. The usual JTAG port is optional and can be replaced with Serial Wire Debug in applications where pin count is important. The number of breakpoints and watchpoints can be configured at design time, as can the trace
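
Returning to the “variable-length” nature of the Thumb-2 instruction set mentioned under “High-density instruction set”: instructions are either 16 or 32 bits long, and the length can be determined from the first halfword alone. If its top five bits are 0b11101, 0b11110 or 0b11111, it is the first half of a 32-bit instruction; otherwise it is a complete 16-bit instruction. The Python sketch below applies that rule to a short stream of halfwords. The function names are invented for illustration and the code only measures lengths; it does not decode what the instructions do.

```python
def thumb_instruction_length(first_halfword):
    """Return the instruction length in bytes (2 or 4) from its first halfword."""
    top5 = (first_halfword >> 11) & 0x1F
    return 4 if top5 in (0b11101, 0b11110, 0b11111) else 2


def instruction_lengths(halfwords):
    """Walk a list of 16-bit halfwords and list each instruction's length."""
    lengths = []
    index = 0
    while index < len(halfwords):
        length = thumb_instruction_length(halfwords[index])
        lengths.append(length)
        index += length // 2   # skip the second halfword of a 32-bit instruction
    return lengths


# 0xBF00 is the 16-bit NOP, 0xF3AF 0x8000 is the 32-bit NOP.W,
# and 0xE7FE is a 16-bit branch-to-self.
stream = [0xBF00, 0xF3AF, 0x8000, 0xE7FE]
print(instruction_lengths(stream))   # [2, 4, 2]
```

Mixing the two lengths freely in one stream is what allows compact 16-bit encodings for common operations without giving up the 32-bit encodings where they are needed.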

Summary

The current range of ARM Cortex processors can be a bit bewildering. It supports requirements from servers, tablets and smartphones, through high-performance real-time, down to the tiniest microcontrollers. Hopefully, this article gives you a way in to the diversity of devices and options which are available. You can find more information about all of them on ARM's website. And if you have questions about specific devices or requirements, where else better to ask them than right here in the community!

Chris

Re-use is only permitted for informational and non-commercial or personal use only.