ERR<n>STATUS, Error Record Primary Status Register, n = 0 - 65534

The ERR<n>STATUS characteristics are:

Purpose

Contains status information for the error record, including:

Within this register:

Configuration

Some or all RW fields of this register have defined reset values.

This register is present only when RAS is implemented. Otherwise, direct accesses to ERR<n>STATUS are UNDEFINED.

The number of error records that are implemented is IMPLEMENTATION DEFINED.

If error record <n> is not implemented, ERR<n>STATUS is RES0.

ERR<q>FR describes the features implemented by the node that owns error record <n>. <q> is the index of the first error record owned by the same node as error record <n>. If the node owns a single record, then q = n.

Attributes

ERR<n>STATUS is a 64-bit register.

Field descriptions

The ERR<n>STATUS bit assignments are:

When ARMv8.4-RAS is implemented:

Bits    63:32  31  30  29  28  27  26  25:24  23  22  21:20  19  18:16  15:8  7:0
Field   RES0   AV  V   UE  ER  OF  MV  CE     DE  PN  UET    CI  RES0   IERR  SERR
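As an illustration of the layout above, the following is a minimal sketch of how software might split a value read from ERR<n>STATUS into its fields. The structure and function names (err_status_fields, err_status_bits, err_status_decode) are hypothetical and are not defined by the architecture; only the bit positions are taken from this description.

```c
#include <stdint.h>

/* Illustrative container; only the bit positions come from the layout. */
struct err_status_fields {
    uint8_t av, v, ue, er, of, mv;  /* single bits [31:26] */
    uint8_t ce;                     /* bits [25:24] */
    uint8_t de, pn;                 /* bit [23], bit [22] */
    uint8_t uet;                    /* bits [21:20] */
    uint8_t ci;                     /* bit [19], ARMv8.4-RAS layout only */
    uint8_t ierr, serr;             /* bits [15:8], bits [7:0] */
};

/* Return `width` bits of `status`, starting at bit `lsb` (width <= 8). */
static inline uint8_t err_status_bits(uint64_t status, unsigned lsb, unsigned width)
{
    return (uint8_t)((status >> lsb) & ((UINT64_C(1) << width) - 1));
}

/* Split a raw ERR<n>STATUS value into its fields. */
static inline struct err_status_fields err_status_decode(uint64_t status)
{
    struct err_status_fields f = {
        .av   = err_status_bits(status, 31, 1),
        .v    = err_status_bits(status, 30, 1),
        .ue   = err_status_bits(status, 29, 1),
        .er   = err_status_bits(status, 28, 1),
        .of   = err_status_bits(status, 27, 1),
        .mv   = err_status_bits(status, 26, 1),
        .ce   = err_status_bits(status, 24, 2),
        .de   = err_status_bits(status, 23, 1),
        .pn   = err_status_bits(status, 22, 1),
        .uet  = err_status_bits(status, 20, 2),
        .ci   = err_status_bits(status, 19, 1),
        .ierr = err_status_bits(status, 8, 8),
        .serr = err_status_bits(status, 0, 8),
    };
    return f;
}
```

Several of the decoded fields read UNKNOWN when ERR<n>STATUS.V is 0, so a caller would normally check the v member before interpreting the rest.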

Bits [63:32]

Reserved, RES0.

AV, bit [31]

Address Valid.

AV  Meaning
0b0

ERR<n>ADDR not valid.

0b1

ERR<n>ADDR contains an address associated with the highest priority error recorded by this record.

This bit is read/write-one-to-clear.

The following resets apply:

V, bit [30]

Status Register Valid.

V  Meaning
0b0

ERR<n>STATUS not valid.

0b1

ERR<n>STATUS valid. At least one error has been recorded.

This bit is read/write-one-to-clear.

The following resets apply:

UE, bit [29]

Uncorrected error.

UE  Meaning
0b0

No errors have been detected, or all detected errors have been either corrected or deferred.

0b1

At least one detected error was not corrected and not deferred.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write one to this bit to clear this bit to zero.

This bit is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

This bit is read/write-one-to-clear.

The following resets apply:

ER, bit [28]

Error Reported.

ER  Meaning
0b0

No in-band error (External abort) reported.

0b1

An External abort was signaled by the node to the master making the access or other transaction. This can be because any of the following are true:

  • The applicable one of the ERR<q>CTLR.{WUE,RUE,UE} bits is implemented, and was set to 1 when an Uncorrected error was detected.

  • The applicable one of the ERR<q>CTLR.{WUE,RUE,UE} bits is not implemented.

It is IMPLEMENTATION DEFINED whether this bit can be set to 1 by a Deferred error.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write one to this bit to clear this bit to zero.

This bit is not valid and reads UNKNOWN if any of the following are true:

This bit is read/write-one-to-clear.

Note

An External abort signaled by the node might be masked and not generate any exception.

The following resets apply:

OF, bit [27]

Overflow.

Indicates that multiple errors have been detected. This bit is set to 1 when one of the following occurs:

Otherwise, this bit is unchanged when an error is recorded.

If a Corrected error counter is implemented:

OF  Meaning
0b0

Since this bit was last cleared to zero, no error syndrome has been discarded and, if a Corrected error counter is implemented, it has not overflowed.

0b1

Since this bit was last cleared to zero, at least one error syndrome has been discarded or, if a Corrected error counter is implemented, it might have overflowed.

If this bit is nonzero, then software must write 1 to this bit, to clear this bit to zero, when clearing ERR<n>STATUS.V to 0.

This bit is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

This bit is read/write-one-to-clear.

The following resets apply:

MV, bit [26]

Miscellaneous Registers Valid.

MV  Meaning
0b0

ERR<n>MISC0, ERR<n>MISC1, ERR<n>MISC2, and ERR<n>MISC3 are not valid.

0b1

The IMPLEMENTATION DEFINED contents of the ERR<n>MISC0, ERR<n>MISC1, ERR<n>MISC2, and ERR<n>MISC3 registers contain additional information for an error recorded by this record.

This bit is read/write-one-to-clear.

Note

If the ERR<n>MISC0, ERR<n>MISC1, ERR<n>MISC2, and ERR<n>MISC3 registers can contain additional information for a previously recorded error, then the contents must be self-describing to software or a user. For example, certain fields might relate only to Corrected errors, and other fields only to the most recent error that was not discarded.

The following resets apply:

CE, bits [25:24]

Corrected Error.

CE  Meaning
0b00

No errors were corrected.

0b01

At least one transient error was corrected.

0b10

At least one error was corrected.

0b11

At least one persistent error was corrected.

The mechanism by which a node detects whether a correctable error is transient or persistent is IMPLEMENTATION DEFINED. If no such mechanism is implemented, then the node sets this field to 0b10 when an error is corrected.

When clearing ERR<n>STATUS.V to 0, if this field is nonzero, then software must write ones to this field to clear this field to zero.

If ERR<n>STATUS.V is set to 0, this field is not valid and reads UNKNOWN.

This field is read/write-one-to-clear. Writing a value other than all-zeros or all-ones sets this field to an UNKNOWN value.
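The CE encodings above can be named in software. The following is a small sketch, using hypothetical names (err_ce_kind, err_status_ce, err_ce_name) that are not part of the architecture.

```c
#include <stdint.h>

/* Names for the four CE encodings; the identifiers are illustrative. */
enum err_ce_kind {
    ERR_CE_NONE        = 0, /* 0b00: no errors were corrected */
    ERR_CE_TRANSIENT   = 1, /* 0b01: at least one transient error corrected */
    ERR_CE_UNSPECIFIED = 2, /* 0b10: at least one error corrected */
    ERR_CE_PERSISTENT  = 3  /* 0b11: at least one persistent error corrected */
};

/* Extract CE, bits [25:24]. Only meaningful when ERR<n>STATUS.V is 1. */
static inline enum err_ce_kind err_status_ce(uint64_t status)
{
    return (enum err_ce_kind)((status >> 24) & 0x3u);
}

/* Short text for logging. */
static inline const char *err_ce_name(enum err_ce_kind kind)
{
    switch (kind) {
    case ERR_CE_NONE:        return "none";
    case ERR_CE_TRANSIENT:   return "transient";
    case ERR_CE_UNSPECIFIED: return "corrected";
    case ERR_CE_PERSISTENT:  return "persistent";
    }
    return "invalid";
}
```

A node with no mechanism to tell transient from persistent errors reports 0b10, so a handler should treat ERR_CE_UNSPECIFIED as "corrected, kind not known" rather than as a third kind of error.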

The following resets apply:

DE, bit [23]

Deferred Error.

DE  Meaning
0b0

No errors were deferred.

0b1

At least one error was not corrected and deferred.

Support for deferring errors is IMPLEMENTATION DEFINED.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write 1 to this bit to clear this bit to zero.

If ERR<n>STATUS.V is set to 0, this bit is not valid and reads UNKNOWN.

This bit is read/write-one-to-clear.

The following resets apply:

PN, bit [22]

Poison.

PN  Meaning
0b0

Uncorrected error or Deferred error recorded because a corrupt value was detected, for example, by an error detection code (EDC).

Note

If a producer node detects a corrupt value and defers the error by producing a poison value, then this bit is set to 0 at the producer node.

0b1

Uncorrected error or Deferred error recorded because a poison value was detected.

Note

This might only be an indication of poison, because, in some EDC schemes, a poison value is encoded as an unlikely form of corrupt data, meaning it is possible to mistake a corrupt value as a poison value.

It is IMPLEMENTATION DEFINED whether a node can distinguish a poison value from a corrupt value.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write 1 to this bit to clear this bit to zero.

This bit is not valid and reads UNKNOWN if any of the following are true:

This bit is read/write-one-to-clear.

The following resets apply:

UET, bits [21:20]

Uncorrected Error Type.

Describes the state of the component after detecting or consuming an Uncorrected error.

UET  Meaning
0b00

Uncorrected error, Uncontainable error (UC).

0b01

Uncorrected error, Unrecoverable error (UEU).

0b10

Uncorrected error, Latent or Restartable error (UEO).

0b11

Uncorrected error, Signaled or Recoverable error (UER).

When clearing ERR<n>STATUS.V to 0, if this field is nonzero, then software must write ones to this field to clear this field to zero.

This field is not valid and reads UNKNOWN if any of the following are true:

This field is read/write-one-to-clear. Writing a value other than all-zeros or all-ones sets this field to an UNKNOWN value.

The following resets apply:

CI, bit [19]

Critical error.

Indicates whether a critical error condition has been recorded.

CI  Meaning
0b0

No critical error condition recorded.

0b1

Critical error condition recorded.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write 1 to this bit to clear this bit to zero.

This bit is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

This bit is read/write-one-to-clear.

The following resets apply:

Bits [18:16]

Reserved, RES0.

IERR, bits [15:8]

IMPLEMENTATION DEFINED error code.

Used with any primary error code SERR value. Further IMPLEMENTATION DEFINED information can be placed in the MISC registers.

This field is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

The following resets apply:

SERR, bits [7:0]

Architecturally-defined primary error code.

Indicates the type of error. The primary error code might be used by a fault handling agent to triage an error without requiring device-specific code. For example, to count and threshold corrected errors in software, or generate a short log entry.

SERR  Meaning
0x00

No error.

0x01

IMPLEMENTATION DEFINED error.

0x02

Data value from (non-associative) internal memory. For example, ECC from on-chip SRAM or buffer.

0x03

IMPLEMENTATION DEFINED pin. For example, nSEI pin.

0x04

Assertion failure. For example, consistency failure.

0x05

Error detected on internal data path. For example, parity on ALU result.

0x06

Data value from associative memory. For example, ECC error on cache data.

0x07

Address/control value from associative memory. For example, ECC error on cache tag.

0x08

Data value from a TLB. For example, ECC error on TLB data.

0x09

Address/control value from a TLB. For example, ECC error on TLB tag.

0x0A

Data value from producer. For example, parity error on write data bus.

0x0B

Address/control value from producer. For example, parity error on address bus.

0x0C

Data value from (non-associative) external memory. For example, ECC error in SDRAM.

0x0D

Illegal address (software fault). For example, access to unpopulated memory.

0x0E

Illegal access (software fault). For example, byte write to word register.

0x0F

Illegal state (software fault). For example, device not ready.

0x10

Internal data register. For example, parity on a SIMD&FP register. For a PE, all general-purpose, stack pointer, and SIMD&FP registers are data registers.

0x11

Internal control register. For example, parity on a System register. For a PE, all registers other than general-purpose, stack pointer, and SIMD&FP registers are control registers.

0x12

Error response from slave. For example, error response from cache write-back.

0x13

External timeout. For example, timeout on interaction with another node.

0x14

Internal timeout. For example, timeout on interface within the node.

0x15

Deferred error from slave not supported at master. For example, poisoned data received from a slave by a master that cannot defer the error further.

All other values are reserved. Reserved values might be defined in a future version of the architecture.
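Because SERR is architecturally defined, a fault handling agent can translate it without device-specific code. The following is a sketch of such a lookup. The table and function names (err_serr_text, err_serr_description) are hypothetical, and the strings are shortened from the table above.

```c
#include <stddef.h>
#include <stdint.h>

/* Descriptions indexed by SERR value, 0x00 to 0x15. */
static const char *const err_serr_text[] = {
    "No error",                                              /* 0x00 */
    "IMPLEMENTATION DEFINED error",                          /* 0x01 */
    "Data value from (non-associative) internal memory",     /* 0x02 */
    "IMPLEMENTATION DEFINED pin",                            /* 0x03 */
    "Assertion failure",                                     /* 0x04 */
    "Error detected on internal data path",                  /* 0x05 */
    "Data value from associative memory",                    /* 0x06 */
    "Address/control value from associative memory",         /* 0x07 */
    "Data value from a TLB",                                 /* 0x08 */
    "Address/control value from a TLB",                      /* 0x09 */
    "Data value from producer",                              /* 0x0A */
    "Address/control value from producer",                   /* 0x0B */
    "Data value from (non-associative) external memory",     /* 0x0C */
    "Illegal address (software fault)",                      /* 0x0D */
    "Illegal access (software fault)",                       /* 0x0E */
    "Illegal state (software fault)",                        /* 0x0F */
    "Internal data register",                                /* 0x10 */
    "Internal control register",                             /* 0x11 */
    "Error response from slave",                             /* 0x12 */
    "External timeout",                                      /* 0x13 */
    "Internal timeout",                                      /* 0x14 */
    "Deferred error from slave not supported at master",     /* 0x15 */
};

/* Return the description for `serr`, or NULL for a reserved value. */
static inline const char *err_serr_description(uint8_t serr)
{
    size_t count = sizeof err_serr_text / sizeof err_serr_text[0];
    return serr < count ? err_serr_text[serr] : NULL;
}
```

Returning NULL for reserved values lets a caller log the raw code instead, which remains useful if a future version of the architecture defines more values.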

This field is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

The following resets apply:

Otherwise:

Bits    63:32  31  30  29  28  27  26  25:24  23  22  21:20  19:16  15:8  7:0
Field   RES0   AV  V   UE  ER  OF  MV  CE     DE  PN  UET    RES0   IERR  SERR

Bits [63:32]

Reserved, RES0.

AV, bit [31]

Address Valid.

AV  Meaning
0b0

ERR<n>ADDR not valid.

0b1

ERR<n>ADDR contains an address associated with the highest priority error recorded by this record.

This bit ignores writes if any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write.

This bit is read/write-one-to-clear.
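The write-ignore condition above can be modelled in software, for example in a simulator or a unit test for a handler. The following sketch assumes the priority order UE, then DE, then CE. That order is not stated in this field description; it is inferred from the order used in the OF field description, and is an assumption. The function name err_status_write_ignored is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define ST_UE (UINT64_C(1) << 29)  /* UE, bit [29] */
#define ST_CE (UINT64_C(3) << 24)  /* CE, bits [25:24] */
#define ST_DE (UINT64_C(1) << 23)  /* DE, bit [23] */

/*
 * Model of the rule: the write is ignored when any of CE, DE, UE is set in
 * `current` and the highest priority one is not cleared by `wval`.
 * ASSUMPTION: priority is UE > DE > CE.
 */
static inline bool err_status_write_ignored(uint64_t current, uint64_t wval)
{
    if (current & ST_UE)
        return (wval & ST_UE) == 0;      /* UE is cleared by writing 1 */
    if (current & ST_DE)
        return (wval & ST_DE) == 0;      /* DE is cleared by writing 1 */
    if (current & ST_CE)
        return (wval & ST_CE) != ST_CE;  /* CE is cleared by writing all-ones */
    return false;                        /* nothing recorded: not ignored */
}
```

For example, with UE and CE both set, a write that only clears CE leaves this bit unchanged under this model, because the higher priority UE is not being cleared in the same write.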

The following resets apply:

V, bit [30]

Status Register Valid.

V  Meaning
0b0

ERR<n>STATUS not valid.

0b1

ERR<n>STATUS valid. At least one error has been recorded.

This bit ignores writes if any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and is not being cleared to 0 in the same write.

This bit is read/write-one-to-clear.

The following resets apply:

UE, bit [29]

Uncorrected error.

UE  Meaning
0b0

No errors have been detected, or all detected errors have been either corrected or deferred.

0b1

At least one detected error was not corrected and not deferred.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write one to this bit to clear this bit to zero.

If ERR<n>STATUS.OF is set to 1 and is not being cleared to 0 in the same write, this bit ignores writes.

This bit is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

This bit is read/write-one-to-clear.

The following resets apply:

ER, bit [28]

Error Reported.

ER  Meaning
0b0

No in-band error (External abort) reported.

0b1

An External abort was signaled by the node to the master making the access or other transaction. This can be because any of the following are true:

  • The applicable one of the ERR<q>CTLR.{WUE,RUE,UE} bits is implemented, and was set to 1 when an Uncorrected error was detected.

  • The applicable one of the ERR<q>CTLR.{WUE,RUE,UE} bits is not implemented.

It is IMPLEMENTATION DEFINED whether this bit can be set to 1 by a Deferred error.

If this bit is nonzero, then software must write 1 to this bit, to clear this bit to zero, when:

This bit is not valid and reads UNKNOWN if any of the following are true:

This bit ignores writes if any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write.

This bit is read/write-one-to-clear.

Note

An External abort signaled by the node might be masked and not generate any exception.

The following resets apply:

OF, bit [27]

Overflow.

Indicates that multiple errors have been detected. This bit is set to 1 when one of the following occurs:

It is IMPLEMENTATION DEFINED whether this bit is set to 1 when one of the following occurs:

It is IMPLEMENTATION DEFINED whether this bit is set to 0 when one of the following occurs:

The IMPLEMENTATION DEFINED clearing of this bit might also depend on the value of the other error status bits.

If a Corrected error counter is implemented:

OF  Meaning
0b0

If ERR<n>STATUS.UE == 1, then no error syndrome for an Uncorrected error has been discarded.

If ERR<n>STATUS.UE == 0 and ERR<n>STATUS.DE == 1, then no error syndrome for a Deferred error has been discarded.

If ERR<n>STATUS.UE == 0, ERR<n>STATUS.DE == 0, and a Corrected error counter is implemented, then the counter has not overflowed.

If ERR<n>STATUS.UE == 0, ERR<n>STATUS.DE == 0, ERR<n>STATUS.CE != 0b00, and no Corrected error counter is implemented, then no error syndrome for a Corrected error has been discarded.

Note

This bit might have been set to 1 when an error syndrome was discarded and later cleared to 0 when a higher priority syndrome was recorded.

0b1

At least one error syndrome has been discarded or, if a Corrected error counter is implemented, it might have overflowed.
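The four conditions listed for OF == 0b0 depend on which of UE, DE, and CE are set. The following sketch selects the statement that applies to a given status value. The enumeration and function names (err_of_scope, err_status_of_scope) are hypothetical, and has_ce_counter is a caller-supplied assumption about the node, not something read from this register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Which class of error the OF == 0 guarantee refers to. */
enum err_of_scope {
    ERR_OF_SCOPE_NONE,         /* none of the listed conditions applies */
    ERR_OF_SCOPE_UNCORRECTED,  /* UE == 1 */
    ERR_OF_SCOPE_DEFERRED,     /* UE == 0, DE == 1 */
    ERR_OF_SCOPE_CE_COUNTER,   /* UE == 0, DE == 0, counter implemented */
    ERR_OF_SCOPE_CORRECTED     /* UE == 0, DE == 0, CE != 0, no counter */
};

static inline enum err_of_scope err_status_of_scope(uint64_t status, bool has_ce_counter)
{
    bool ue = (status >> 29) & 1u;      /* UE, bit [29] */
    bool de = (status >> 23) & 1u;      /* DE, bit [23] */
    unsigned ce = (status >> 24) & 3u;  /* CE, bits [25:24] */

    if (ue)
        return ERR_OF_SCOPE_UNCORRECTED;
    if (de)
        return ERR_OF_SCOPE_DEFERRED;
    if (has_ce_counter)
        return ERR_OF_SCOPE_CE_COUNTER;
    if (ce != 0)
        return ERR_OF_SCOPE_CORRECTED;
    return ERR_OF_SCOPE_NONE;
}
```

The result only says what OF == 0b0 rules out. As the Note above explains, a lower priority syndrome might still have been discarded earlier and the bit later cleared to 0.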

If this bit is nonzero, then software must write 1 to this bit, to clear this bit to zero, when clearing ERR<n>STATUS.V to 0.

This bit is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

This bit is read/write-one-to-clear.

The following resets apply:

MV, bit [26]

Miscellaneous Registers Valid.

MV  Meaning
0b0

ERR<n>MISC0 and ERR<n>MISC1 not valid.

0b1

The IMPLEMENTATION DEFINED contents of the ERR<n>MISC0 and ERR<n>MISC1 registers contain additional information for an error recorded by this record.

This bit ignores writes if any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write.

This bit is read/write-one-to-clear.

Note

If the ERR<n>MISC0 and ERR<n>MISC1 registers can contain additional information for a previously recorded error, then the contents must be self-describing to software or a user. For example, certain fields might relate only to Corrected errors, and other fields only to the most recent error that was not discarded.

The following resets apply:

CE, bits [25:24]

Corrected Error.

CE  Meaning
0b00

No errors were corrected.

0b01

At least one transient error was corrected.

0b10

At least one error was corrected.

0b11

At least one persistent error was corrected.

The mechanism by which a node detects whether a correctable error is transient or persistent is IMPLEMENTATION DEFINED. If no such mechanism is implemented, then the node sets this field to 0b10 when an error is corrected.

When clearing ERR<n>STATUS.V to 0, if this field is nonzero, then software must write ones to this field to clear this field to zero.

If ERR<n>STATUS.OF is set to 1 and is not being cleared to 0 in the same write, this field ignores writes.

If ERR<n>STATUS.V is set to 0, this field is not valid and reads UNKNOWN.

This field is read/write-ones-to-clear. Writing a value other than all-zeros or all-ones sets this field to an UNKNOWN value.

The following resets apply:

DE, bit [23]

Deferred Error.

DE  Meaning
0b0

No errors were deferred.

0b1

At least one error was not corrected and deferred.

Support for deferring errors is IMPLEMENTATION DEFINED.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write 1 to this bit to clear this bit to zero.

If ERR<n>STATUS.OF is set to 1 and is not being cleared to 0 in the same write, this bit ignores writes.

If ERR<n>STATUS.V is set to 0, this bit is not valid and reads UNKNOWN.

This bit is read/write-one-to-clear.

The following resets apply:

PN, bit [22]

Poison.

PN  Meaning
0b0

Uncorrected error or Deferred error recorded because a corrupt value was detected, for example, by an error detection code (EDC).

Note

If a producer node detects a corrupt value and defers the error by producing a poison value, then this bit is set to 0 at the producer node.

0b1

Uncorrected error or Deferred error recorded because a poison value was detected.

Note

This might only be an indication of poison, because, in some EDC schemes, a poison value is encoded as an unlikely form of corrupt data, meaning it is possible to mistake a corrupt value as a poison value.

It is IMPLEMENTATION DEFINED whether a node can distinguish a poison value from a corrupt value.

When clearing ERR<n>STATUS.V to 0, if this bit is nonzero, then software must write 1 to this bit to clear this bit to zero.

When clearing both ERR<n>STATUS.{DE, UE} to 0, if this bit is nonzero, then software must write 1 to this bit to clear this bit to zero.

This bit is not valid and reads UNKNOWN if any of the following are true:

When any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write, this bit ignores writes.

This bit is read/write-one-to-clear.

The following resets apply:

UET, bits [21:20]

Uncorrected Error Type.

Describes the state of the component after detecting or consuming an Uncorrected error.

UET  Meaning
0b00

Uncorrected error, Uncontainable error (UC).

0b01

Uncorrected error, Unrecoverable error (UEU).

0b10

Uncorrected error, Latent or Restartable error (UEO).

0b11

Uncorrected error, Signaled or Recoverable error (UER).

When clearing ERR<n>STATUS.V to 0, if this field is nonzero, then software must write ones to this field to clear this field to zero.

When clearing ERR<n>STATUS.UE to 0, if this field is nonzero, then software must write ones to this field to clear this field to zero.

This field is not valid and reads UNKNOWN if any of the following are true:

When any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write, this field ignores writes.

This field is read/write-ones-to-clear. Writing a value other than all-zeros or all-ones sets this field to an UNKNOWN value.

The following resets apply:

Bits [19:16]

Reserved, RES0.

IERR, bits [15:8]

IMPLEMENTATION DEFINED error code.

Used with any primary error code SERR value. Further IMPLEMENTATION DEFINED information can be placed in the MISC registers.

This field is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

When any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write, this field ignores writes.

The following resets apply:

SERR, bits [7:0]

Architecturally-defined primary error code.

Indicates the type of error. The primary error code might be used by a fault handling agent to triage an error without requiring device-specific code. For example, to count and threshold corrected errors in software, or generate a short log entry.

SERR  Meaning
0x00

No error.

0x01

IMPLEMENTATION DEFINED error.

0x02

Data value from (non-associative) internal memory. For example, ECC from on-chip SRAM or buffer.

0x03

IMPLEMENTATION DEFINED pin. For example, nSEI pin.

0x04

Assertion failure. For example, consistency failure.

0x05

Error detected on internal data path. For example, parity on ALU result.

0x06

Data value from associative memory. For example, ECC error on cache data.

0x07

Address/control value from associative memory. For example, ECC error on cache tag.

0x08

Data value from a TLB. For example, ECC error on TLB data.

0x09

Address/control value from a TLB. For example, ECC error on TLB tag.

0x0A

Data value from producer. For example, parity error on write data bus.

0x0B

Address/control value from producer. For example, parity error on address bus.

0x0C

Data value from (non-associative) external memory. For example, ECC error in SDRAM.

0x0D

Illegal address (software fault). For example, access to unpopulated memory.

0x0E

Illegal access (software fault). For example, byte write to word register.

0x0F

Illegal state (software fault). For example, device not ready.

0x10

Internal data register. For example, parity on a SIMD&FP register. For a PE, all general-purpose, stack pointer, and SIMD&FP registers are data registers.

0x11

Internal control register. For example, parity on a System register. For a PE, all registers other than general-purpose, stack pointer, and SIMD&FP registers are control registers.

0x12

Error response from slave. For example, error response from cache write-back.

0x13

External timeout. For example, timeout on interaction with another node.

0x14

Internal timeout. For example, timeout on interface within the node.

0x15

Deferred error from slave not supported at master. For example, poisoned data received from a slave by a master that cannot defer the error further.

All other values are reserved. Reserved values might be defined in a future version of the architecture.

This field is not valid and reads UNKNOWN if ERR<n>STATUS.V is set to 0.

When any of ERR<n>STATUS.{CE, DE, UE} are set to 1, and the highest priority of these is not being cleared to 0 in the same write, this field ignores writes.

The following resets apply:

Accessing the ERR<n>STATUS

After reading the status register, software must clear the valid bits to allow new errors to be recorded.

Between reading the register and clearing the valid bits, a new error might have overwritten the register. To prevent this new error being lost:

Software must write ones to the {ER, PN, UET, CI} fields when clearing ERR<n>STATUS.{V, UE, OF, CE, DE}.
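One way to build the value to write back is sketched below. It is a sketch under stated assumptions, not a definitive sequence. It writes 1 only to the V, UE, OF, DE, AV, and MV bits that were seen as set in the value previously read, widens CE to all-ones when it was nonzero, and adds ones for ER, PN, and UET (and CI when has_ci is true) whenever any of V, UE, OF, CE, DE is being cleared. The macro and function names are hypothetical. The has_ci parameter is an assumption that lets the caller avoid writing 1 to bit [19] where it is RES0. IERR and SERR are left as zero in the returned value, which is also an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define ES_BIT(n) (UINT64_C(1) << (n))
#define ES_AV   ES_BIT(31)
#define ES_V    ES_BIT(30)
#define ES_UE   ES_BIT(29)
#define ES_ER   ES_BIT(28)
#define ES_OF   ES_BIT(27)
#define ES_MV   ES_BIT(26)
#define ES_CE   (UINT64_C(3) << 24)
#define ES_DE   ES_BIT(23)
#define ES_PN   ES_BIT(22)
#define ES_UET  (UINT64_C(3) << 20)
#define ES_CI   ES_BIT(19)

/*
 * Compute a write value that clears what was observed in `seen`.
 * `has_ci` says whether bit [19] is the CI field (ARMv8.4-RAS layout).
 */
static inline uint64_t err_status_clear_value(uint64_t seen, bool has_ci)
{
    /* Write 1 only to the single-bit fields that were observed as 1. */
    uint64_t w = seen & (ES_AV | ES_V | ES_UE | ES_OF | ES_MV | ES_DE);

    /* CE is two bits wide: a nonzero value is cleared by writing all-ones. */
    if (seen & ES_CE)
        w |= ES_CE;

    /* When clearing any of V, UE, OF, CE, DE, also write ones to
     * ER, PN, and UET, and to CI where that field exists. */
    if (w & (ES_V | ES_UE | ES_OF | ES_CE | ES_DE)) {
        w |= ES_ER | ES_PN | ES_UET;
        if (has_ci)
            w |= ES_CI;
    }
    return w;
}
```

Deriving the write value from the value that was read, rather than writing a fixed all-ones pattern, means a bit that became set after the read is not written with 1 by this sequence.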

ERR<n>STATUS can be accessed through the memory-mapped interfaces:

Component  Offset       Instance
RAS        0x010 + 64n  ERR<n>STATUS

Access on this interface is RW.
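The offset calculation can be written as follows. This is a minimal sketch: the function names (err_status_offset, ras_read_err_status) and the ras_base parameter are hypothetical, and the use of a single 64-bit access is an assumption about the platform rather than something stated here.

```c
#include <stdint.h>

/* Byte offset of ERR<n>STATUS within the RAS component: 0x010 + 64n. */
static inline uint64_t err_status_offset(uint32_t n)
{
    return UINT64_C(0x010) + UINT64_C(64) * n;
}

/*
 * Read ERR<n>STATUS through a memory-mapped view of the RAS component.
 * `ras_base` is a hypothetical, already-mapped base address.
 */
static inline uint64_t ras_read_err_status(const volatile void *ras_base, uint32_t n)
{
    const volatile uint8_t *p = (const volatile uint8_t *)ras_base + err_status_offset(n);
    return *(const volatile uint64_t *)p;
}
```

For example, error record 1 is at offset 0x050, and the highest numbered record, 65534, is at offset 0x3FFF90.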




13/12/2018 16:42; 6379d01c197f1d40720d32d0f84c419c9187c009

Copyright © 2010-2018 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.