The Cortex-A53 processor protects against soft errors that result in a RAM bitcell temporarily holding the incorrect value. The processor writes a new value to the RAM to correct the error. If the error is a hard error that writing to the RAM does not correct, for example a physical defect in the RAM, then the processor might enter a livelock as it continually detects and then tries to correct the error.
Some RAMs have Single Error Detect (SED) capability, while others have Single Error Correct, Double Error Detect (SECDED) capability. The L1 data cache dirty RAM is Single Error Detect, Single Error Correct (SEDSEC). The processor can make progress and remain functionally correct when there is a single bit error in any RAM. If there are multiple single bit errors in different RAMs, or within different protection granules within the same RAM, then the processor also remains functionally correct. If there is a double bit error in a single RAM within the same protection granule, then the behavior depends on the RAM:
For RAMs with SECDED capability listed in Table 8.1, the error is detected and reported as described in Error reporting. If the error is in a cache line containing dirty data, then that data might be lost, resulting in data corruption.
For RAMs with only SED, a double bit error is not detected and therefore might cause data corruption.
If there are three or more bit errors, then depending on the RAM and the position of the errors within the RAM, the errors might be detected or might not be detected.
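The difference between SED and SECDED protection can be illustrated with a minimal sketch. It assumes a single even-parity bit for SED and a textbook extended Hamming code for SECDED. The source does not specify the encodings that the Cortex-A53 RAMs use, so the function names, bit layout, and the 32-bit granule chosen here are illustrative assumptions only.

```python
def parity(word: int) -> int:
    """Return 1 if the number of set bits in word is odd, else 0."""
    return bin(word).count("1") & 1

def sed_detects(stored: int, stored_parity: int) -> bool:
    """SED: an error is flagged only when the recomputed parity differs."""
    return parity(stored) != stored_parity

def check_bit_count(m: int) -> int:
    """Hamming check bits needed for m data bits (overall parity bit excluded)."""
    r = 0
    while (1 << r) < m + r + 1:
        r += 1
    return r

def syndrome(code: int, n: int) -> int:
    """XOR of the positions (1..n) of all set bits; 0 for a valid codeword."""
    syn = 0
    for pos in range(1, n + 1):
        if (code >> pos) & 1:
            syn ^= pos
    return syn

def secded_encode(data: int, m: int) -> int:
    """Encode m data bits. Bit 0 is overall parity, powers of two are check bits."""
    r = check_bit_count(m)
    n = m + r
    code, d = 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):              # not a power of two: data position
            code |= ((data >> d) & 1) << pos
            d += 1
    syn = syndrome(code, n)
    for j in range(r):                   # set check bits so the syndrome becomes 0
        if (syn >> j) & 1:
            code |= 1 << (1 << j)
    return code | parity(code)           # bit 0 makes the total parity even

def secded_decode(code: int, m: int):
    """Return (data, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    n = m + check_bit_count(m)
    syn = syndrome(code, n)
    if parity(code):                     # odd number of flipped bits
        if syn > n:
            return None, "uncorrectable"
        code ^= 1 << syn                 # syn == 0 means bit 0 itself flipped
        status = "corrected"
    elif syn:                            # even number of flips, nonzero syndrome
        return None, "uncorrectable"
    else:
        status = "ok"
    data, d = 0, 0
    for pos in range(1, n + 1):
        if pos & (pos - 1):
            data |= ((code >> pos) & 1) << d
            d += 1
    return data, status

word = 0xDEADBEEF
p = parity(word)
assert sed_detects(word ^ 0b1, p)        # one flipped bit: detected
assert not sed_detects(word ^ 0b11, p)   # two flipped bits: missed by parity
code = secded_encode(word, 32)
assert secded_decode(code ^ (1 << 5), 32) == (word, "corrected")
assert secded_decode(code ^ (1 << 5) ^ (1 << 9), 32) == (None, "uncorrectable")
```

In this sketch, flipping the three bits at positions 1, 2, and 3 of the SECDED codeword produces a zero syndrome with odd parity, so the decoder reports a corrected single-bit error and returns wrong data. This is one way in which three or more bit errors might go undetected.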
The Cortex-A53 CPU cache protection support has a minimal performance impact when no errors are present. When an error is detected, the access that caused the error is stalled while the correction takes place. When the correction is complete, the access either continues with the corrected data, or is retried. If the access is retried, it either hits in the cache again with the corrected data, or misses in the cache and re-fetches the data from a lower level cache or from main memory. The behavior for each RAM is shown in Table 8.1.
| RAM | Protection type | Configuration option | Protection granule | Correction behavior |
|---|---|---|---|---|
| L1 I-cache tag | Parity, SED | CPU_CACHE_PROTECTION | 31 bits | Both lines in the cache set are invalidated, then the line requested is refetched from L2 or external memory. |
| L1 I-cache data | Parity, SED | CPU_CACHE_PROTECTION | 20 bits | Both lines in the cache set are invalidated, then the line requested is refetched from L2 or external memory. |
| TLB | Parity, SED | CPU_CACHE_PROTECTION | 31 bits or 52 bits | Entry invalidated, new pagewalk started to refetch it. |
| L1 D-cache tag | Parity, SED | CPU_CACHE_PROTECTION | 32 bits | Line cleaned and invalidated from L1. SCU duplicate tags are used to get the correct address. Line refetched from L2 or external memory. |
| L1 D-cache data | ECC, SECDED | CPU_CACHE_PROTECTION | 32 bits | Line cleaned and invalidated from L1, with single bit errors corrected as part of the eviction. Line refetched from L2 or external memory. |
| L1 D-cache dirty | Parity, SEDSEC | CPU_CACHE_PROTECTION | 1 bit | Line cleaned and invalidated from L1, with single bit errors corrected as part of the eviction. Only the dirty bit is protected. The other bits are performance hints, therefore do not cause a functional failure if they are incorrect. |
| SCU L1 duplicate tag | ECC, SECDED | CPU_CACHE_PROTECTION | 33 bits | Tag rewritten with correct value, access retried. If the error is uncorrectable then the tag is invalidated. |
| L2 tag | ECC, SECDED | SCU_CACHE_PROTECTION | 33 bits | Tag rewritten with correct value, access retried. If the error is uncorrectable then the tag is invalidated. |
| L2 victim | None | - | - | The victim RAM is used only as a performance hint. It does not result in a functional failure if the contents are incorrect. |
| L2 data | ECC, SECDED | SCU_CACHE_PROTECTION | 64 bits | Data is corrected inline, access might stall for an additional cycle or two while the correction takes place. After correction, the line might be evicted from the processor. |
| Branch predictor | None | - | - | The branch predictor RAMs are used only as a performance hint. They do not result in a functional failure if the contents are incorrect. |
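The invalidate-and-refetch behavior listed for the L1 instruction cache RAMs can be modeled with a short sketch. The `Way` and `TwoWaySet` classes, the dictionary that stands in for L2 or external memory, and the single parity bit covering both tag and data are hypothetical simplifications, not a description of the hardware. Refetching is safe in this model because an instruction cache never holds dirty data.

```python
from dataclasses import dataclass

def parity(value: int) -> int:
    """Return 1 if the number of set bits in value is odd, else 0."""
    return bin(value).count("1") & 1

@dataclass
class Way:
    valid: bool = False
    tag: int = 0
    data: int = 0
    parity: int = 0  # written together with the entry (model assumption)

class TwoWaySet:
    """One two-way cache set backed by a dict that stands in for L2/memory."""

    def __init__(self, backing):
        self.backing = backing
        self.ways = [Way(), Way()]
        self.victim = 0      # simple alternating replacement pointer
        self.refetches = 0   # counts fills from the backing store

    def _fill(self, tag):
        data = self.backing[tag]
        way = self.ways[self.victim]
        self.victim ^= 1
        way.valid, way.tag, way.data = True, tag, data
        way.parity = parity(tag) ^ parity(data)
        self.refetches += 1
        return data

    def _has_parity_error(self):
        return any(
            w.valid and (parity(w.tag) ^ parity(w.data)) != w.parity
            for w in self.ways
        )

    def read(self, tag):
        if self._has_parity_error():
            # Invalidate both lines in the set, then refetch the requested line.
            for w in self.ways:
                w.valid = False
            return self._fill(tag)
        for w in self.ways:
            if w.valid and w.tag == tag:
                return w.data          # ordinary hit
        return self._fill(tag)         # ordinary miss

cache_set = TwoWaySet({0x10: 111, 0x20: 222})
cache_set.read(0x10)
cache_set.read(0x20)
cache_set.ways[0].data ^= 1 << 3       # inject a single bit soft error
assert cache_set.read(0x10) == 111     # error detected, correct data refetched
```

Because both lines are invalidated, a later read of the other line in the set also misses and is refetched, even though that line was not corrupted.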
If a correctable ECC error occurs after the first data cache access of a load instruction that takes multiple cycles to complete, such as LDM, and one of the following conditions has taken place:
A hardware breakpoint, watchpoint, or vector catch has been set since the first execution and is triggered on re-execution.
The page tables have been modified since the first execution, resulting in an instruction or data abort trap being taken on re-execution.
Then the register file is updated with the data that was successfully read before the correctable ECC error occurred.
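The consequence for software is that a multi-cycle load can leave the register file partially updated when the re-execution is interrupted. The following is a simplified behavioral sketch, not a model of the pipeline. The `load_multiple` function, its parameters, the 4-byte stride, and the `RetryTrap` exception are hypothetical names introduced for illustration.

```python
class RetryTrap(Exception):
    """Hypothetical stand-in for a debug event or abort taken on re-execution."""

def load_multiple(regs, memory, base, reg_names,
                  ecc_error_at=None, trap_on_retry=False):
    """Load consecutive words into the named registers, one access per register.

    ecc_error_at is the index of the access that sees a correctable ECC error
    (assumed to be 1 or greater, that is, after the first access).
    """
    for i, name in enumerate(reg_names):
        if i == ecc_error_at:
            # The error is corrected and the instruction is re-executed.
            if trap_on_retry:
                # A breakpoint, watchpoint, vector catch, or abort now fires.
                # Registers written before the error keep their new values.
                raise RetryTrap("exception taken on re-execution")
            return load_multiple(regs, memory, base, reg_names)
        regs[name] = memory[base + 4 * i]
    return regs

regs = {"r0": 0, "r1": 0, "r2": 0}
memory = {0x100: 11, 0x104: 22, 0x108: 33}
try:
    load_multiple(regs, memory, 0x100, ["r0", "r1", "r2"],
                  ecc_error_at=1, trap_on_retry=True)
except RetryTrap:
    pass
assert regs == {"r0": 11, "r1": 0, "r2": 0}   # r0 updated, r1 and r2 not
```

When neither condition applies, the re-execution in this sketch completes normally and all listed registers receive their values.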