Arm Cortex-A55 Core Technical Reference Manual : Cache protection behavior

Cache protection behavior

The core protects against soft errors that result in a RAM bitcell temporarily holding the incorrect value.

The Cortex®-A55 core writes a new value to the RAM to correct the error. If the error is a hard error that is not corrected by writing to the RAM, for example a physical defect in the RAM, then the core might get into a livelock as it continually detects and then tries to correct the error.

Some RAMs have Single Error Detect (SED) capability, while others have Single Error Correct, Double Error Detect (SECDED) capability. The core can make progress and remain functionally correct when there is a transient single bit error in any RAM. If there are multiple single bit errors in different RAMs, or within different protection granules within the same RAM, then the core also remains functionally correct. If there is a double bit error in a single RAM within the same protection granule, then the behavior depends on the RAM:

  • For RAMs with SECDED capability listed in the following table, the error is detected and reported as described in error reporting. If the error is in a cache line containing dirty data, then that data might be lost, resulting in data corruption.
  • For RAMs with only SED, a double bit error is not detected and therefore might cause data corruption.
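The reason a parity-protected (SED) RAM misses a double bit error can be shown with a short model. The following is a minimal sketch, not the Cortex-A55 implementation: it assumes one even-parity bit per stored word, and the function names (`parity_bit`, `store`, `has_error`) and the 20-bit example value are hypothetical, chosen only for illustration.

```python
def parity_bit(value: int) -> int:
    """Return 1 if the number of set bits in value is odd, else 0."""
    return bin(value).count("1") & 1


def store(value: int):
    """Model a RAM write: keep the value together with its parity bit."""
    return value, parity_bit(value)


def has_error(stored_value: int, stored_parity: int) -> bool:
    """Model a RAM read: recompute parity and compare with the stored bit."""
    return parity_bit(stored_value) != stored_parity


# Illustrative 20-bit word (assumption; the real RAM layout is not modeled).
word, check = store(0xABCDE)

single = word ^ (1 << 3)                 # one bit flipped
double = word ^ (1 << 3) ^ (1 << 17)     # two bits flipped in the same word

single_detected = has_error(single, check)   # parity changes, error is seen
double_detected = has_error(double, check)   # parity is unchanged, error is missed
```

Two flips in the same word restore the original parity, so the check passes even though the stored value is wrong. This corresponds to the data corruption case described for RAMs with only SED.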

If there are three or more bit errors, then depending on the RAM and the position of the errors within the RAM, the errors might be detected or might not be detected.
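The difference between one, two, and three bit errors under SECDED can be sketched with a toy extended Hamming code. This is an assumption-laden illustration: it protects only 4 data bits with 4 check bits, and the actual ECC code used by the Cortex-A55 RAMs is not described in this section. The names `encode`, `decode`, and `flip` are hypothetical.

```python
DATA_POSITIONS = (3, 5, 6, 7)   # data bits sit at the non-power-of-two positions
PARITY_POSITIONS = (1, 2, 4)    # Hamming check bits; position 0 is overall parity


def encode(nibble: int) -> list:
    """Encode 4 data bits into an 8-bit extended Hamming codeword."""
    code = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        code[pos] = (nibble >> i) & 1
    for p in PARITY_POSITIONS:
        # Check bit p covers every position whose index has bit p set.
        code[p] = sum(code[pos] for pos in range(1, 8) if pos & p and pos != p) & 1
    code[0] = sum(code[1:]) & 1  # make the total number of 1s even
    return code


def decode(code: list):
    """Return (status, nibble), where status is ok, corrected, or uncorrectable."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    overall = sum(code) & 1
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flips: treated as a single error at position `syndrome`
        # (position 0 is the overall parity bit itself).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Even number of flips with a nonzero syndrome: detected, not correctable.
        status = "uncorrectable"
    nibble = sum(code[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return status, nibble


def flip(code: list, *positions: int) -> list:
    """Return a copy of code with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


clean = encode(0b1011)
one_error = decode(flip(clean, 5))            # corrected back to 0b1011
two_errors = decode(flip(clean, 5, 6))        # flagged as uncorrectable
three_errors = decode(flip(clean, 1, 2, 3))   # "corrected" to the wrong value
```

In this toy code the three-flip case looks like a single error to the decoder, so it reports a correction while returning 0b1010 instead of 0b1011. Other three-flip patterns behave differently, which matches the statement that detection depends on the position of the errors.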

The Cortex-A55 cache protection support has a minimal performance impact when no errors are present. When an error is detected, the access that caused the error is stalled while the correction takes place. When the correction is complete, the access either continues with the corrected data, or is retried. If the access is retried, it either hits in the cache again with the corrected data, or misses in the cache and re-fetches the data from a lower level cache or from main memory. The behavior for each RAM is shown in the following table.
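The retry path, in which an access misses after the faulty entry is invalidated and the data is re-fetched from the next level, can be sketched as follows. This is a simplified software model under stated assumptions: a single parity bit per entry, a dictionary standing in for L2 or main memory, and a hypothetical `ParityCache` class with an `inject_bit_flip` helper for testing. It does not model timing or stalls.

```python
class ParityCache:
    """Toy cache whose entries each carry one parity bit."""

    def __init__(self, backing: dict):
        self.backing = backing       # stands in for L2 or main memory
        self.lines = {}              # address -> (value, parity)
        self.errors_detected = 0
        self.fills = 0

    @staticmethod
    def _parity(value: int) -> int:
        return bin(value).count("1") & 1

    def _fill(self, addr: int) -> None:
        """Fetch the value from the next level and store it with fresh parity."""
        value = self.backing[addr]
        self.lines[addr] = (value, self._parity(value))
        self.fills += 1

    def read(self, addr: int) -> int:
        while True:
            if addr not in self.lines:
                self._fill(addr)                 # miss: fetch from the next level
            value, parity = self.lines[addr]
            if self._parity(value) == parity:
                return value                     # hit with good data
            # Error detected: invalidate the entry, then retry the access.
            self.errors_detected += 1
            del self.lines[addr]

    def inject_bit_flip(self, addr: int, bit: int) -> None:
        """Test helper: corrupt one bit of a cached value, leaving parity stale."""
        value, parity = self.lines[addr]
        self.lines[addr] = (value ^ (1 << bit), parity)


memory = {0x40: 0xD503201F}          # arbitrary example word
cache = ParityCache(memory)
first = cache.read(0x40)             # cold miss, one fill
cache.inject_bit_flip(0x40, 7)       # simulate a soft error
second = cache.read(0x40)            # detect, invalidate, retry, second fill
```

If the corruption reappeared after every fill, as a hard error would, the loop in `read` would never exit. That is the software analogue of the livelock described for errors that are not corrected by rewriting the RAM.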

Table A8-1 Cache protection behavior

| RAM | Protection type | Protection granule | Correction behavior |
|---|---|---|---|
| L1 instruction cache tag | Parity, SED | 31 bits | Both lines in the cache set are invalidated, then the line requested is refetched from L2 or external memory. |
| L1 instruction cache data | Parity, SED | 20 bits | Both lines in the cache set are invalidated, then the line requested is refetched from L2 or external memory. |
| L2 TLB tag | Parity, SED | 39 bits or 40 bits | Entry invalidated, new pagewalk started to refetch it. |
| L2 TLB data | Parity, SED | 43 bits | Entry invalidated, new pagewalk started to refetch it. |
| L1 data cache tag | ECC, SECDED | 32 bits | Line cleaned and invalidated from L1. SCU duplicate tags are used to get the correct address. Line refetched from L2 or external memory, with single bit errors corrected as part of the eviction. |
| L1 data cache data | ECC, SECDED | 32 bits | Line cleaned and invalidated from L1, with single bit errors corrected as part of the eviction. Line refetched from L2 or external memory. |
| L1 data cache dirty | ECC, SECDED | 2 bits | Line cleaned and invalidated from L1, with single bit errors corrected as part of the eviction. Only the dirty bit is protected. The other bits are performance hints, therefore do not cause a functional failure if they are incorrect. |
| L2 cache tag | ECC, SECDED | 30, 31, or 32 bits depending on the cache size | Tag rewritten with correct value, access retried. If the error is uncorrectable then the tag is invalidated. |
| L2 cache victim | None | - | The victim RAM is used only as a performance hint. It does not result in a functional failure if the contents are incorrect. |
| L2 cache data | ECC, SECDED | 64 bits | Data is corrected inline, access might stall for an additional cycle or two while the correction takes place. |
| L2 data buffer | ECC, SECDED | 72 bits | Data is corrected inline, access might stall for an additional cycle or two while the correction takes place. |
| Branch predictor | None | - | The branch predictor RAMs are used only as a performance hint. They do not result in a functional failure if the contents are incorrect. |

Note

When an ECC error occurs during a load instruction that takes multiple cycles to complete, for example LDM, the load instruction will re-execute. However, if a state change occurs between the original load instruction and the second attempt, then the second attempt will not execute. The first attempt could leave the register file in an inconsistent state, since the register file may have been updated for locations that did not have errors.

The following situations will cause a state change between the original instruction and the second attempt:

  • A hardware breakpoint, watchpoint, or vector catch has been set since the first execution that is triggered on re-execution.
  • The page tables have been modified since the first execution, resulting in an instruction or data abort trap being taken on re-execution.

In these situations, software may be able to observe that the original load instruction committed some new state despite not fully completing.
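The partially committed state can be illustrated with a small software model of a multi-register load. This is a sketch only: it assumes registers are written in order up to the word that reports the error, which is a simplification and not a description of the actual pipeline. The names `load_multiple` and `DebugEvent`, and the `trap` parameter that stands in for a newly armed breakpoint, watchpoint, vector catch, or abort, are hypothetical.

```python
class DebugEvent(Exception):
    """Stands in for a breakpoint, watchpoint, vector catch, or abort."""


def load_multiple(regs, memory, base, reg_names, faulty_addrs, trap=False):
    """Toy multi-register load.

    Returns True when every register was loaded, or False when an ECC error
    forces the instruction to be re-executed. Raises DebugEvent when a state
    change prevents the instruction from executing at all.
    """
    if trap:
        raise DebugEvent("state changed before re-execution")
    for i, name in enumerate(reg_names):
        addr = base + 4 * i
        if addr in faulty_addrs:
            faulty_addrs.discard(addr)   # model the RAM being corrected
            return False                 # instruction must be re-executed
        regs[name] = memory[addr]        # registers before the error are updated
    return True


memory = {0x100: 11, 0x104: 22, 0x108: 33}
regs = {"r0": 0, "r1": 0, "r2": 0}
names = ["r0", "r1", "r2"]

# First attempt: the second word reports an ECC error after r0 is written.
completed = load_multiple(regs, memory, 0x100, names, {0x104})

# Second attempt: a state change (modeled by trap=True) stops re-execution.
try:
    load_multiple(regs, memory, 0x100, names, set(), trap=True)
    trapped = False
except DebugEvent:
    trapped = True

observed = dict(regs)   # r0 holds new data; r1 and r2 still hold old values
```

In the model, software inspecting `observed` after the trap sees r0 updated while r1 and r2 are unchanged, which is the kind of partially committed state the note describes.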