How do cache policies work on the Cortex-M7?
This knowledge article is relevant for programmers who are using a Cortex-M7 based device that is configured with caches included. It is particularly useful to programmers who are using caches for the first time.
Cortex-M7 is a high-performance processor that implements the Armv7-M architecture, which is the Arm microcontroller architecture profile. The Cortex-M7 provides various hardware options for the chip designer. These options include optional instruction and data caches inside the processor, and options for how cache policy information is signaled on the memory bus. To find out which options a specific Cortex-M7 based device uses, consult the product documentation from your silicon provider.
A cache is a fast memory which is local to the processor and which can hold copies of data from locations in the main memory. Often, access to main memory through a memory bus takes several clock cycles. Therefore, caching a local copy in fast memory improves performance when the addresses in the main memory are accessed many times.
The term "cache policy" refers to the set of behaviors that control what happens to data in the cache. It has two parts. The "cache allocation policy" determines what data is copied from main memory into the cache. The "cache write policy" determines when updates that the processor makes to the data are written back out to memory.
Cortex-M7 uses standard cache policies that are common to other Arm processors.
The cache allocation policy for an address range is one of the following:
Allocate on read miss.
Allocate on read or write miss.
If the processor tries to load or fetch from an address that is not in the cache, and the cache allocation policy for that address is "Allocate on read miss", the processor reads a block of eight words that includes the required address (a cache line) from memory and stores it in the cache. This process is called a cache linefill.
If the cache allocation policy for an address is "Allocate on read or write miss" and that address is not already in the cache, the processor performs a cache linefill if there is either a read or a write to that address.
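The two allocation policies can be illustrated with a toy software model. The following is a minimal sketch, not a description of the real hardware: the direct-mapped organization, the four-line size, and all names (toy_cache_t, toy_access, and so on) are assumptions made for illustration. Only the 32-byte (eight-word) line size comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

#define LINE_BYTES 32u  /* eight 32-bit words per cache line */
#define NUM_LINES   4u  /* hypothetical size, kept tiny for illustration */

typedef enum { ALLOC_READ, ALLOC_READ_WRITE } alloc_policy_t;

typedef struct {
    bool     valid[NUM_LINES];
    uint32_t tag[NUM_LINES];
    unsigned linefills;        /* number of linefills performed so far */
} toy_cache_t;

/* Returns true if the line that contains addr is already in the cache. */
static bool toy_hit(const toy_cache_t *c, uint32_t addr)
{
    uint32_t line = addr / LINE_BYTES;
    uint32_t idx  = line % NUM_LINES;
    return c->valid[idx] && c->tag[idx] == line / NUM_LINES;
}

/* Models a cache linefill: the whole line that contains addr is loaded. */
static void toy_linefill(toy_cache_t *c, uint32_t addr)
{
    uint32_t line = addr / LINE_BYTES;
    uint32_t idx  = line % NUM_LINES;
    c->valid[idx] = true;
    c->tag[idx]   = line / NUM_LINES;
    c->linefills++;
}

/* Performs one access and returns true on a hit. On a miss, a linefill
 * happens for reads under both policies, and for writes only under
 * "Allocate on read or write miss". */
bool toy_access(toy_cache_t *c, uint32_t addr, bool is_write,
                alloc_policy_t policy)
{
    if (toy_hit(c, addr)) {
        return true;
    }
    if (!is_write || policy == ALLOC_READ_WRITE) {
        toy_linefill(c, addr);
    }
    return false;
}
```

In this model, a write miss under ALLOC_READ leaves the cache unchanged, so a later read of the same address still misses. Under ALLOC_READ_WRITE the write miss itself triggers the linefill, so the later read hits.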
The cache write policy is either "Write-Through" or "Write-Back" (also known as "copy-back"). If the cache write policy for an address is Write-Through and the address is in the cache, a store by the processor to that address updates the data in the cache and also writes the data out to main memory. However, if the cache write policy is Write-Back, a store by the processor that hits in the cache updates the data in the cache, but it does not write the data to memory. The result is a cache line that is marked as "dirty". A dirty cache line is the only up-to-date copy of the data for that address. The cache line is written back to memory later, either after a request to clean the cache, or when the cache line is recycled. Cache lines are recycled to allow a different cache line to be loaded into the cache.
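The difference between the two write policies can be sketched with a single modeled cache line. This is an illustrative model only. The type and function names are hypothetical, and a real cache line holds eight words rather than one value.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum { WRITE_THROUGH, WRITE_BACK } write_policy_t;

typedef struct {
    uint32_t cached;  /* copy of the data held in the cache line */
    uint32_t memory;  /* copy of the data held in main memory */
    bool     dirty;   /* true when the cache holds the only current copy */
} toy_line_t;

/* Models a store that hits in the cache. */
void toy_store_hit(toy_line_t *line, uint32_t value, write_policy_t policy)
{
    line->cached = value;
    if (policy == WRITE_THROUGH) {
        line->memory = value;   /* memory is updated by the same store */
    } else {
        line->dirty = true;     /* memory is now stale until a clean */
    }
}

/* Models a cache clean or an eviction: dirty data is written back. */
void toy_clean(toy_line_t *line)
{
    if (line->dirty) {
        line->memory = line->cached;
        line->dirty  = false;
    }
}
```

After a Write-Back store in this model, the memory field still holds the old value until toy_clean runs. That window is where another bus master reading main memory would see stale data.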
An attribute that is called "shareable" also affects the behavior of the data cache for an address. The shareable attribute indicates that another processor or agent in the system can read or write to this address in the main memory. Sharing creates the following coherency problem: the local cached copy of data for an address might not match the content of main memory for the same address. By default, the Cortex-M7 processor does not cache shareable memory and so avoids this coherency issue. However, it is possible to change this behavior by writing to the L1 Cache Control Register, the CM7_CACR. For more information on the CM7_CACR register, read section 3.3.8 of the ARM® Cortex®-M7 Technical Reference Manual (TRM).
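As a rough sketch of how software might change this behavior, the helper below sets or clears the SIWT bit, which makes the processor treat shareable cacheable memory as Write-Through. The register address (0xE000EF9C) and bit positions (SIWT at bit 0, FORCEWT at bit 2) are given here as understood from the TRM and CMSIS headers and should be checked against the TRM for your processor revision. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed from the Cortex-M7 TRM; verify for your device and revision. */
#define CM7_CACR_ADDR     0xE000EF9Cu
#define CM7_CACR_SIWT     (1u << 0)  /* shareable cacheable is Write-Through */
#define CM7_CACR_FORCEWT  (1u << 2)  /* force all cacheable to Write-Through */

/* Sets or clears SIWT with a read-modify-write so that the other bits in
 * the register keep their values. The register is passed as a pointer so
 * that the helper can also be exercised on a host with an ordinary
 * variable. */
void cacr_set_siwt(volatile uint32_t *cacr, int enable)
{
    uint32_t value = *cacr;
    if (enable) {
        value |= CM7_CACR_SIWT;
    } else {
        value &= ~CM7_CACR_SIWT;
    }
    *cacr = value;
}
```

On a target, the call would look like cacr_set_siwt((volatile uint32_t *)CM7_CACR_ADDR, 1). The TRM describes any restrictions on when this register can be changed, for example relative to enabling the data cache.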
The ARM®v7-M Architecture Reference Manual (ARM), section B3.1, defines a default system address map with a specific cache policy for each region of Normal type memory. Each of these regions defaults to one of the following cache policies:
Write-Through with allocate on read miss (WT)
Write-Back with allocate on read or write miss (WBWA).
Regions of Device type memory are not cacheable.
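The default map can be summarized as a lookup on the top three address bits, because it divides the 4GB address space into eight 0.5GB regions. The sketch below is a transcription of the default map as commonly documented for ARMv7-M and should be verified against section B3.1. The enum and function names are hypothetical, and the sketch ignores device-specific features and any MPU programming.

```c
#include <stdint.h>

typedef enum {
    POLICY_WT,            /* Write-Through, allocate on read miss */
    POLICY_WBWA,          /* Write-Back, allocate on read or write miss */
    POLICY_NOT_CACHEABLE  /* Device or Strongly-ordered memory */
} default_policy_t;

/* Returns the default cache policy for an address when no MPU region
 * overrides it. */
default_policy_t default_cache_policy(uint32_t addr)
{
    switch (addr >> 29) {
    case 0:  return POLICY_WT;            /* 0x00000000: Code */
    case 1:  return POLICY_WBWA;          /* 0x20000000: SRAM */
    case 2:  return POLICY_NOT_CACHEABLE; /* 0x40000000: Peripheral */
    case 3:  return POLICY_WBWA;          /* 0x60000000: external RAM */
    case 4:  return POLICY_WT;            /* 0x80000000: external RAM */
    default: return POLICY_NOT_CACHEABLE; /* 0xA0000000 and above:
                                             Device and System */
    }
}
```

For example, under these assumptions an address in the SRAM region such as 0x20001000 defaults to WBWA, while a peripheral address such as 0x40000000 is not cacheable.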
In addition, if the chip designer has included the optional Memory Protection Unit (MPU), you can override the memory type for most regions. You can also apply custom cache policy settings for regions of Normal type memory that are programmed in the MPU. The settings that are available are described in section B3.5.9 of the ARM®v7-M ARM. These settings include WT, WBWA and an option for write-back with allocate on read miss (WB).
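In the MPU, these settings are selected through the TEX, C, and B fields of the region attribute and size register (MPU_RASR). The following sketch shows the encodings for the three policies named above, where the same policy applies to both inner and outer caches. The bit positions (TEX at bits [21:19], C at bit 17, B at bit 16) and encodings are given as understood from section B3.5.9 and should be confirmed there. The macro, enum, and function names are hypothetical.

```c
#include <stdint.h>

#define RASR_TEX_SHIFT 19u         /* TEX field occupies bits [21:19] */
#define RASR_C_BIT     (1u << 17)  /* cacheable */
#define RASR_B_BIT     (1u << 16)  /* bufferable */

typedef enum { MPU_WT, MPU_WB, MPU_WBWA } mpu_policy_t;

/* Returns the TEX, C, and B bits for a Normal memory region with the
 * requested policy. The caller must OR in the remaining MPU_RASR fields,
 * such as access permissions, region size, and the enable bit. */
uint32_t rasr_cache_attr(mpu_policy_t policy)
{
    switch (policy) {
    case MPU_WT:    /* TEX=000, C=1, B=0 */
        return RASR_C_BIT;
    case MPU_WB:    /* TEX=000, C=1, B=1 */
        return RASR_C_BIT | RASR_B_BIT;
    case MPU_WBWA:  /* TEX=001, C=1, B=1 */
        return (1u << RASR_TEX_SHIFT) | RASR_C_BIT | RASR_B_BIT;
    }
    return 0u;      /* TEX=000, C=0, B=0 is Strongly-ordered, not cached */
}
```

A complete region setup also needs the base address in MPU_RBAR and the other MPU_RASR fields, which this sketch leaves to the caller.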
These custom cache policies are further divided into inner and outer policies, and you can choose different policies for each one. The caches inside the processor respond to the inner policy settings. The outer policy is signaled on the memory bus. The outer policy is used by extra levels of caching that are implemented outside of the processor in the memory system. An example of this type of extra level of caching is a level 2 cache controller. However, Cortex-M7 also exposes the inner cache policy settings as external signals. As a result, a chip designer can apply the inner settings to an external level of cache. Changing the settings in this way is a chip-specific implementation feature. For more information about this feature, read the chip-specific documentation.
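Separate inner and outer policies use the TEX=1BB form of the MPU_RASR encoding, in which the two low TEX bits select the outer policy and the C and B bits select the inner policy. The sketch below assumes the two-bit policy encoding described in section B3.5.9 (00 non-cacheable, 01 WBWA, 10 WT, 11 WB). Treat it as an illustration to check against the manual. All names are hypothetical.

```c
#include <stdint.h>

/* Two-bit policy encoding assumed from the ARMv7-M ARM. */
typedef enum {
    CP_NON_CACHEABLE = 0,
    CP_WBWA          = 1,  /* Write-Back, allocate on read or write miss */
    CP_WT            = 2,  /* Write-Through, allocate on read miss */
    CP_WB            = 3   /* Write-Back, allocate on read miss */
} cache_policy_t;

/* Builds the TEX, C, and B bits of MPU_RASR for a Normal memory region
 * with independent inner and outer cache policies. */
uint32_t rasr_inner_outer_attr(cache_policy_t inner, cache_policy_t outer)
{
    uint32_t tex = 0x4u | (uint32_t)outer;      /* TEX = 1BB, BB = outer */
    uint32_t c   = ((uint32_t)inner >> 1) & 1u; /* C = inner policy bit 1 */
    uint32_t b   = (uint32_t)inner & 1u;        /* B = inner policy bit 0 */
    return (tex << 19) | (c << 17) | (b << 16);
}
```

For example, rasr_inner_outer_attr(CP_WBWA, CP_WT) requests Write-Back with write allocate in the processor's own caches while signaling Write-Through to any outer cache. Whether an outer cache exists and responds to the signal is chip-specific.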
Although using caches is good for performance, cache use reduces determinism of execution. For example, a read from memory that hits in the cache is resolved quickly, but a read that misses in the cache needs a cache linefill. As a result, the time that a given access takes depends on what the cache contains at that moment.
Usually, Write-Back cache policies are more efficient than Write-Through cache policies because they reduce the number of slow and energy-consuming writes out to memory. However, Write-Back cache policies have an administrative overhead because software must manage cache coherency. Write-Back cache policies also decrease determinism of execution further. A memory access that needs a cache linefill might first have to evict a dirty cache line, which means writing that line back to memory to make space for the new one.
Although Write-Through cache policies use more energy and are slower due to the increased bus traffic, they have two advantages over using write-back cache policies:
Write-Through cache policies reduce the need for software cache maintenance.
Write-Through cache policies avoid the extra decrease in determinism because there is never any dirty data in the cache.
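The software cache maintenance mentioned above usually operates on whole cache lines, so a buffer that is shared with another bus master is typically expanded to line boundaries before it is cleaned or invalidated. The helper below is a minimal sketch of that address arithmetic for a 32-byte line. The names are hypothetical, and the sketch does not handle a range that wraps past the top of the address space. On a target, the result would be passed to a maintenance routine such as the CMSIS-Core function SCB_CleanDCache_by_Addr. Check the CMSIS header for your version for the exact signature.

```c
#include <stdint.h>

#define DCACHE_LINE_BYTES 32u  /* eight words per cache line */

typedef struct {
    uint32_t start;  /* line-aligned start address */
    uint32_t size;   /* size in bytes, a multiple of the line size */
} maint_range_t;

/* Expands [addr, addr + len) to the smallest range of whole cache lines
 * that covers it. An empty input gives an empty range. */
maint_range_t dcache_line_range(uint32_t addr, uint32_t len)
{
    maint_range_t range;
    uint32_t mask = DCACHE_LINE_BYTES - 1u;
    uint32_t end;

    range.start = addr & ~mask;
    if (len == 0u) {
        range.size = 0u;
        return range;
    }
    end = (addr + len + mask) & ~mask;  /* round the end up to a line */
    range.size = end - range.start;
    return range;
}
```

Because maintenance affects whole lines, unrelated data that shares a line with the buffer is cleaned or invalidated as well. For that reason, buffers used this way are often aligned to, and padded to a multiple of, the line size.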
In safety-critical systems where the Cortex-M7 has been implemented with Error Correcting Code (ECC) on its caches, Write-Through is preferred. With Write-Through, main memory always holds an up-to-date copy of the data, so if ECC detects that a line has become corrupted in the cache, the processor can recover good data from memory. However, some versions of Cortex-M7 include an erratum which causes data corruption in Write-Through cacheable memory regions. These Write-Through cacheable memory regions include the following:
Memory regions that have the Write-Through cacheable property.
Memory regions that are converted to Write-Through cacheable by software programming the CM7_CACR register.
For more information on the erratum, read Version 7.0 of the Cortex-M7 (AT610) and Cortex-M7 with FPU (AT611) Software Developer Errata Notice, published November 2018. The number and name of the erratum is 1259864, Data corruption in a sequence of Write-Through stores and loads.