Uncorrected errors and data poisoning
When an error is detected, the correction mechanism is triggered. However, if the error is a 2-bit error in a RAM protected by ECC, then the error is not correctable.
The behavior on an uncorrected error depends on the type of RAM.
Uncorrected error detected in a data RAM
When an uncorrected error is detected in a data RAM:
- The chunk of data with the error is marked as poisoned. This poison information is then transferred with the data and stored in the cache if the data is allocated back into a cache. The poisoned data is stored per 64 bits of data, except in the L1 data cache where it is stored per 32 bits of data.
If the interconnect supports poisoning, then the poison is passed along with the data when the line is evicted from the cluster. No abort is generated when a line is poisoned, as the abort can be deferred until the point when the poisoned data is consumed by a load or instruction fetch.
Uncorrected error detected in a tag or dirty RAM
When an uncorrected error is detected in a tag RAM or dirty RAM, either the address or coherency state of the line is not known anymore, and the data cannot be poisoned. In this case, the line is invalidated and an interrupt is generated to notify software that data has potentially been lost.