Architecture	Armv8-R AArch64 Compliant with Armv8.4-A extensions
Instruction Set	A64 instruction set
Microarchitecture	Eight-stage, in-order, superscalar pipeline with direct and indirect branch prediction.
Cache controllers	Separate L1 data cache and L1 instruction cache private to each core, each with a configurable size. An optional unified (instructions and data) L2 cache, shared between all cores, that can be configured up to 4MB. Partial L2 cache power-down support.
Tightly-Coupled Memories (TCM)	Two optional TCMs private to each core: an ITCM for instructions and literal pool data, and a DTCM for data. Each TCM has a configurable size of up to 1MB.
Cache protection	Reliability, Availability, and Serviceability (RAS) extension. Optional Error Correcting Code (ECC), Single Error Correct Double Error Detect (SECDED) or Double Error Detect (DED) protection for all of the instantiated cache tag and data RAMs, the TCM RAMs, and the TLB RAMs.
Interrupt interface	Standard interrupt inputs (IRQ and FIQ) are provided, together with an interface to an external GICv3.2-compliant Generic Interrupt Controller (GIC) that supports complex priority-based interrupt handling. The processor includes low-latency interrupt technology that allows long multicycle instructions to be interrupted and restarted. Lengthy memory accesses are also deferred in certain circumstances.
Memory Protection Unit (MPU)	Two optional and programmable MPUs controlled from EL1 and EL2 respectively. Configure attributes for up to 32 regions per MPU. Regions cannot overlap.
Memory Management Unit (MMU)	Optional EL1 MMU for fine-grained memory system control through virtual-to-physical address mappings and memory attributes held in translation tables.
Floating Point Unit (FPU) and Advanced SIMD (Neon)	Optional FPU implementing the Arm Vector and Floating Point architecture VFPv4 with 32 x 128-bit registers, compliant with IEEE 754. There is support for: Advanced SIMD, half precision, single precision, and double precision.
Main Manager Interface	Shared Main Manager (MM) port implemented as AXI5 256-bit providing access for instructions, data, and peripherals. This interface can optionally be a 256-bit CHI-E interface.
Subordinate Interface	128-bit shared AXI-S port used for two purposes: as an LLRAM Accelerator Coherency Port, enabling I/O-coherent external access to the LLRAM port, and as a TCM subordinate, enabling external agents to access the TCMs within the cores.
Low Latency RAM Port (LLRAM)	Optional AXI5 256-bit shared LLRAM port providing low-latency access for instructions and data. The port is designed to connect to local memory. This local memory provides many of the benefits of TCM, but can also be slower and lower power, and is easily shared between up to eight processor cores.
Shared Peripheral Port (SPP)	Optional AXI5 64-bit SPP for providing access to peripherals.
Low Latency Peripheral Port (LLPP)	Optional dedicated per-core 32-bit AXI5 port for tightly integrating latency-sensitive peripherals with a specific core within the processor.
Main Accelerator Coherency Port (MACP)	ACE5-Lite 128-bit shared subordinate MACP for external access to MM address ranges. MACP enables I/O coherency for external agents with the per-core L1 data cache and shared L2 cache.
Up to eight cores	With in-cluster hardware coherency.
Debug	A Debug Access Port is provided. Its functionality can be extended using CoreSight Debug and Trace.
Trace	Cortex-R82 includes one CoreSight Embedded Trace Module per core.
Additional Features	Can address up to 1TB of volatile DRAM, fulfilling the requirements of emerging memory technologies. The optional Memory Management Unit (MMU) enables rich operating systems (OS), such as Linux and Android, supported by an ecosystem offering a software stack and development tools. Delivers higher compute performance for complex data storage applications, including Computational Storage Drives (CSDs). Implementing the Cortex-R82 processor in a Solid State Drive (SSD) based on the NVMe or NVMe-oF specifications or the CXL architecture enables efficient parallel computation at the data itself. Suited for 5G modems that require very high performance and deterministic, low-latency operation, as well as for meeting the high-throughput requirements of smartphones and laptops.
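
The MPU row above states two constraints: up to 32 regions per MPU, and regions cannot overlap. The following is a minimal sketch of how a configuration tool might check a proposed region set against those two rules before programming the hardware. The `Region` type, its inclusive base/limit representation, and the `validate_regions` helper are hypothetical names introduced for illustration; they are not part of any Arm software interface, and alignment or attribute rules are deliberately not modelled.

```python
from typing import List, NamedTuple

MAX_REGIONS = 32  # per-MPU region limit stated in the feature table


class Region(NamedTuple):
    """Hypothetical region descriptor with inclusive base and limit addresses."""
    base: int
    limit: int


def validate_regions(regions: List[Region]) -> List[str]:
    """Return a list of problems found; an empty list means both rules hold."""
    problems = []

    if len(regions) > MAX_REGIONS:
        problems.append(
            f"{len(regions)} regions requested, limit is {MAX_REGIONS}"
        )

    for r in regions:
        if r.limit < r.base:
            problems.append(f"region {r.base:#x}-{r.limit:#x}: limit below base")

    # Sort by base address and remember the region that reaches furthest.
    # Any later region starting at or below that limit overlaps it.
    furthest = None
    for r in sorted(regions):
        if furthest is not None and r.base <= furthest.limit:
            problems.append(
                f"region {r.base:#x}-{r.limit:#x} overlaps "
                f"{furthest.base:#x}-{furthest.limit:#x}"
            )
        if furthest is None or r.limit > furthest.limit:
            furthest = r

    return problems


if __name__ == "__main__":
    proposed = [Region(0x0000, 0xFFFF), Region(0x8000, 0x17FFF)]
    for problem in validate_regions(proposed):
        print(problem)
```

Sorting first keeps the check to a single pass over the regions, which is more than enough for 32 entries; a pairwise comparison would work equally well at this size.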

Characteristics

Processor area, frequency, and power consumption are highly dependent on process, libraries, and optimizations, as well as on the configuration selected. The following table gives estimated characteristics for a typical single-core implementation of the Cortex-R82 processor, configured with 32KB L1 instruction cache, 32KB L1 data cache, 32KB of ITCM, 32KB of DTCM, and the full Advanced SIMD and floating-point engine.

Cortex-R82	7nm
Maximum Clock Frequency	Above 1.8GHz
Performance	3.71 / 4.32 / 8.67 DMIPS/MHz *; 6.28 CoreMark/MHz **
Total Core Area	From 0.32mm²

Note:

* Benchmark built with GCC 9.2. The first result abides by all of the 'ground rules' laid out in the Dhrystone documentation; the second permits inlining of functions (not just the permitted C string libraries); and the third additionally permits link-time optimizations. All results use version 2.1 of Dhrystone with ANSI C-style function declarations.

** Benchmark built with Arm Compiler for Embedded 6.17 Ultimate (AC6.17) using “-Omax -fomit-frame-pointer -fno-common -fno-vectorize -flto”, among other options.
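
The per-MHz figures in the table can be scaled to an absolute score by multiplying by the clock frequency in MHz. The short sketch below does that arithmetic for a single core. It assumes a clock of exactly 1.8GHz, whereas the table only says "above 1.8GHz", so the results are best read as rough lower bounds. The source also does not state that the benchmark figures were measured on the 7nm implementation described, so treat the combination as illustrative. The names `FREQ_MHZ`, `PER_MHZ`, and `scale_to_frequency` are introduced here for the example.

```python
FREQ_MHZ = 1800.0  # assumption: exactly 1.8GHz; the table says "above 1.8GHz"

# Per-MHz scores copied from the characteristics table.
PER_MHZ = {
    "Dhrystone, ground rules (DMIPS)": 3.71,
    "Dhrystone, inlining (DMIPS)": 4.32,
    "Dhrystone, inlining + LTO (DMIPS)": 8.67,
    "CoreMark": 6.28,
}


def scale_to_frequency(score_per_mhz: float, freq_mhz: float) -> float:
    """Convert a per-MHz benchmark score into an absolute score."""
    return score_per_mhz * freq_mhz


# Absolute single-core scores at the assumed frequency.
totals = {
    name: scale_to_frequency(score, FREQ_MHZ) for name, score in PER_MHZ.items()
}

if __name__ == "__main__":
    for name, total in totals.items():
        print(f"{name}: {total:.0f}")
```

At the assumed 1.8GHz this works out to roughly 6,678, 7,776, and 15,606 DMIPS for the three Dhrystone variants, and about 11,304 CoreMark.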