Cortex-M4 Overview

The ARM Cortex-M4 processor is ARM’s high performance embedded processor developed to address digital signal control markets that demand an efficient, easy-to-use blend of control and signal processing capabilities.

The combination of high-efficiency signal processing functionality with the low-power, low cost and ease-of-use benefits of the Cortex-M family of processors is designed to satisfy the emerging category of flexible solutions specifically targeting the motor control, automotive, power management, embedded audio and industrial automation markets.

  • Cortex-M4 Technical Reference Manual

    For system designers, integrators and software programmers building or writing software for systems containing a Cortex-M4.

    Technical Reference Manual


Architecture ARMv7E-M Harvard 
ISA Support Thumb® / Thumb-2
DSP Extensions Single cycle 16/32-bit MAC
Single cycle dual 16-bit MAC
8/16-bit SIMD arithmetic
Hardware Divide (2-12 Cycles)
Floating Point Unit Single precision floating point unit
IEEE 754 compliant
Pipeline 3-stage + branch speculation
Memory Protection Optional 8 region MPU with sub regions and background region
Interrupts Non-maskable Interrupt (NMI) + 1 to 240 physical interrupts
Interrupt Priority Levels 8 to 256 priority levels
Wake-up Interrupt Controller Up to 240 Wake-up Interrupts
Sleep Modes Integrated WFI and WFE Instructions and Sleep On Exit capability.
Sleep & Deep Sleep Signals.
Optional Retention Mode with ARM Power Management Kit
Bit Manipulation Integrated Instructions & Bit Banding
Debug Optional JTAG & Serial-Wire Debug Ports. Up to 8 Breakpoints and 4 Watchpoints.
Trace Optional Instruction Trace (ETM), Data Trace (DWT), and Instrumentation Trace (ITM)

Key Features

SIMD, saturating arithmetic, fast MAC

Powerful instruction set for accelerating DSP applications, built right into the processor.  A highly optimised DSP library built using these instructions is available free-of-charge from the ARM website.

Powerful debug and non-obtrusive real-time trace

Comprehensive debug and trace features dramatically improve developer productivity. It is extremely efficient to develop embedded software with proper debug.

Memory Protection Unit (MPU)

Software reliability improves when each module is allowed access only to specific areas of memory required for it to operate. This protection prevents unexpected access that may overwrite critical data.

Integrated nested vectored interrupt controller (NVIC)

here is no need for a standalone external interrupt controller. Interrupt handling is taken care of by the NVIC removing the complexity of managing interrupts manually via the processor.

Thumb-2 code density

On average, the mix between 16bit and 32bit instructions yields better code density when compared to 8bit and 16bit architectures. This has significant advantages in terms of reduced memory requirements and maximizing the usage of precious on-chip Flash memory.

Cortex-M4 Characteristics

Performance Efficiency: 3.40 CoreMark/MHz* and without FPU: 1.25 / 1.52 / 1.91 DMIPS/MHz**, With FPU: 1.27 / 1.55 / 1.95 DMIPS/MHz**

ARM Cortex-M4 Implementation Data***
(7-track, typical 1.8v, 25C)
(7-track, typical 1.2v, 25C)
(9-track, typical 1.1v, 85C)
Dynamic Power 151 µW/MHz 32.82 µW/MHz 12.26 µW/MHz
Floorplanned Area 0.44 mm2 0.119 mm2 0.028 mm2

 * see:

 ** The first result abides by all of the “ground rules” laid out in the Dhrystone documentation, the second permits inlining of functions, not just the permitted C string libraries, while the third additionally permits simultaneous (”multi-file”) compilation. All are with the original (K&R) v2.1 of Dhrystone 

 *** Base usable configuration includes DSP extensions, 1 IRQ + NMI, excludes ETM, MPU, FPU  and debug