For the first time, there is also an integrated Generic Interrupt Controller (GIC), Snoop Control Unit (SCU) and timers to further reduce latency and enable symmetric multiprocessing in a dual core configuration.
- Very high performance real-time processor with option for single or dual core.
- Fully cache coherent to simplify dual core software development.
- Always reacts to system events in a fast and deterministic manner.
- Tightly Coupled Memory to keep critical instructions and data close to the core for fast, immediate, and deterministic processing.
- Immediate, deterministic, and coherent control of system peripherals.
- Integrated Generic Interrupt Controller (GIC) for fast and deterministic interrupt responses.
|Instruction Set||Arm and Thumb-2. Supports DSP instructions and an optional Floating-Point Unit with single-precision or double-precision support.
|Microarchitecture||11-stage pipeline with instruction pre-fetch, branch prediction, superscalar and out-of-order execution, register renaming, and parallel execution paths for load-store, MAC, shift-ALU, divide and floating-point. Also features a hardware divider and is binary compatible with the Arm9, Arm11, Cortex-R4 and Cortex-R5 embedded processors.
|Cache controllers||Harvard memory architecture with optional integrated Instruction and Data cache controllers. Cache sizes configurable from 4 to 64KB. Cache lines are write-through.
|Tightly-Coupled Memories||Optional Tightly-Coupled Memory interfaces serve highly deterministic or low-latency applications that may not respond well to caching (e.g. instruction code for interrupt service routines and data that requires intense processing). Instruction TCMs, data TCMs, or both can be implemented. TCM size can be up to 128KB.
|Interrupt Interface||Standard interrupt (IRQ) and non-maskable fast interrupt (FIQ) inputs are provided, together with a fully integrated Generic Interrupt Controller (GIC) supporting complex priority-based interrupt handling. The processor includes low-latency interrupt technology that allows long multi-cycle instructions to be interrupted and restarted. Lengthy memory accesses are deferred in certain circumstances.
|Memory Protection Unit (MPU)||Optional MPU configures attributes for up to sixteen regions, each with resolution down to 256 Bytes. Regions can overlap, and the highest numbered region has highest priority.
|Floating-Point Unit (FPU)||Optional FPU implements the Arm Vector Floating Point architecture VFPv3 with 16 double-precision registers, compliant with IEEE754. There is support for two FPU options: either a single precision-only or both single and double precision. The FPU performance is optimized for both single and double precision calculations. Operations include add, subtract, multiply, divide, multiply and accumulate, square root, conversions between fixed and floating-point, and floating-point constant instructions.
|ECC||Optional single-bit error correction and two-bit error detection for cache and/or TCM memories and all interfaces with ECC bits. Single-bit soft errors are automatically corrected by the processor. In addition, there is full and flexible support for managing hard errors.
|Master AMBA AXI bus||64-bit AMBA AXI bus master for Level-2 memory and peripheral access.
|Slave AXI bus||Optional 64-bit AMBA AXI bus slave port allows DMA masters to access the TCMs for high speed streaming of data in and out of the processor.
|Low Latency Peripheral Port (LLPP)||A dedicated 32-bit AMBA AXI port to integrate latency-sensitive peripherals more tightly with the processor.
|Accelerator Coherency Port (ACP)||A 64-bit AMBA AXI slave port to enable coherency between the processor(s) and external intelligent peripherals such as DMA controllers, Ethernet or Flexray interfaces.
|Low latency memory port||A 64-bit AMBA AXI master port designed specifically to connect to local memory. This local memory provides many of the benefits of TCM; in addition, it can be slower and lower power, and it is easily shared between coherent peripherals and the one or two Cortex-R7 processor cores.
|Dual-core||A dual-processor configuration, used either as a redundant Cortex-R7 CPU in lock-step for fault-tolerant or fault-detecting dependable systems, or as dual cores running independently, each executing its own program with its own bus interfaces, interrupts, etc.
|Debug||Debug Access Port is provided. Its functionality can be extended using CoreSight SoC-400.
|Trace||An interface suitable for connection to CoreSight Embedded Trace Module is present.
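The overlap rule in the MPU row above (the highest-numbered region has highest priority) can be modelled in a few lines of C. This is a minimal software sketch under stated assumptions, not the processor's register interface: the `mpu_region_t` type, its fields, and the `mpu_match_region` function are hypothetical names introduced here purely for illustration.

```c
#include <stdint.h>

#define MPU_MAX_REGIONS 16   /* from the table: up to sixteen regions */

/* Hypothetical software model of one MPU region (not a hardware register layout). */
typedef struct {
    uint32_t base;     /* first address covered by the region */
    uint64_t size;     /* region length in bytes */
    int      enabled;  /* non-zero if the region takes part in matching */
} mpu_region_t;

/*
 * Return the index of the region whose attributes apply to addr, or -1 if
 * no enabled region covers it. Regions may overlap; scanning from the
 * highest index downwards makes the highest-numbered match win.
 */
int mpu_match_region(const mpu_region_t *regions, int count, uint32_t addr)
{
    if (count > MPU_MAX_REGIONS) {
        count = MPU_MAX_REGIONS;
    }
    for (int i = count - 1; i >= 0; --i) {
        if (regions[i].enabled &&
            addr >= regions[i].base &&
            (uint64_t)(addr - regions[i].base) < regions[i].size) {
            return i;
        }
    }
    return -1;
}
```

For example, with region 0 covering 64KB from address 0 and region 1 covering 256 bytes from 0x1000, a lookup of 0x1080 returns 1 (both regions match, the higher number wins), while 0x2000 falls back to region 0.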
The Cortex-R7 processor is designed for implementation on advanced silicon processes and offers high performance, high energy efficiency, real-time responsiveness, advanced features, and ease of system design. It is typically used in demanding real-time applications, such as:
- Wireless modems (LTE, 5G)
- Storage (HDD, SSD)
- Networking and routers
High Performance Real-Time Embedded Processing
11-stage superscalar out-of-order pipeline with advanced dynamic and static branch prediction and dynamic register renaming for extreme real-time performance.
Cache Coherent Multiprocessing System
Symmetric or asymmetric multiprocessing capabilities with cache coherency through the Snoop Control Unit (SCU) and I/O coherency through the Accelerator Coherency Port (ACP).
Advanced Memory System
Tiered memory system enabling multiple levels of performance, latency and cost choices. Tightly Coupled Memory (TCM) and Low Latency RAM (LLRAM) are options in addition to the caches and main memory system.
Integrated Interrupt Controller
Integral Generic Interrupt Controller (GIC) for fast and deterministic interrupt processing.
Processor area, frequency and power consumption are highly dependent on process, libraries and optimizations. The table below gives estimates for a typical single-processor implementation of the Cortex-R7 on a mainstream high-performance mobile process technology (28 nm HPM), using high-density, standard-performance cell libraries with a 32KB instruction cache and a 32KB data cache.
|Single processor systems||28 nm HPM|
|Maximum Clock frequency||Above 1.5 GHz|
|Performance||2.50 / 2.90 / 3.77 DMIPS/MHz *|
|Total area (Including Core+RAM+Routing)||From 0.33 mm²|
|Efficiency||From 46 DMIPS/mW|
* The first result abides by all of the 'ground rules' laid out in the Dhrystone documentation, the second permits inlining of functions (not just the permitted C string libraries) while the third additionally permits simultaneous multifile compilation. All are with the original (K&R) v2.1 of Dhrystone.
** CFLAGS ="--endian=little --cpu=Cortex-R7 --fpu=None -Ohs --no_size_constraints"
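The units in the table combine by simple arithmetic: DMIPS/MHz times clock frequency gives total DMIPS, and total DMIPS divided by DMIPS/mW gives power. The sketch below shows that calculation. Note the assumption: the table quotes frequency and efficiency as bounds ("Above 1.5 GHz", "From 46 DMIPS/mW") and does not state that they hold at the same operating point, so plugging them in together is illustrative only. The function names are hypothetical.

```c
/* DMIPS/MHz multiplied by the clock in MHz gives total DMIPS. */
double total_dmips(double dmips_per_mhz, double clock_mhz)
{
    return dmips_per_mhz * clock_mhz;
}

/* Total DMIPS divided by efficiency in DMIPS/mW gives power in mW. */
double power_mw(double dmips, double dmips_per_mw)
{
    return dmips / dmips_per_mw;
}
```

With the ground-rules figure of 2.50 DMIPS/MHz at 1500 MHz, `total_dmips(2.50, 1500.0)` gives 3750 DMIPS. If the 46 DMIPS/mW figure were assumed to apply at that same point, `power_mw(3750.0, 46.0)` would come to roughly 81.5 mW.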
The Cortex-R7 processor can be incorporated into a SoC using a broad range of Arm technology including System IP and Physical IP. It is fully supported by Arm development tools. Related IP and tools include:
Cortex-R7 Technical Reference Manual
In-depth technical information on the Cortex-R7 for system designers, integrators, verification engineers and software engineers.
Cortex-R Series Programmer's Guide
For software developers working in assembly language, this guide covers programming Cortex-R series devices.
Development Tools for Cortex-R Series
DS-5 Development Studio and a range of third-party and open-source tools support Cortex-R series software development.
Arm Design Reviews
Arm's on-site design review service gives licensees confidence that their Cortex-R7 CPU is implemented efficiently, to provide maximum system performance, with lowest risk and fastest time-to-market.