Why do different cores behave differently when executing a WFE instruction?

Information in this article applies to:

  • Cortex-A72

  • Cortex-A53

Question

Why do different cores behave differently when executing a WFE instruction?

Answer

Suppose you have the following execution sequence:

Figure 1 Execution sequence

The first Wait For Event (WFE) instruction puts the core in WFE state. However, the second WFE does not necessarily do the same. Whether it does depends on the implementation of the core you use. For example:

  • For cores such as Cortex-A72, after the Send Event (SEV) instruction is executed, the second WFE puts the core in WFE state.

  • For cores such as Cortex-A53, the Event Register is still set after the SEV instruction is executed. The second WFE clears the Event Register and completes immediately, so you need one more WFE to put the core in WFE state.

Cortex-A72 and Cortex-A53 handle WFE instructions differently because the architecture permits either behavior, as the following text from the ARMv8 architecture explains:

The WaitForEvent() pseudocode procedure optionally suspends execution until a WFE wake-up event or reset occurs, or until some earlier time if the implementation chooses. It is IMPLEMENTATION DEFINED whether restarting execution after the period of suspension causes a ClearEventRegister() to occur.

For Cortex-A72, restarting execution from WFE state causes a ClearEventRegister(). That is, after the wake-up event sets the Event Register, the register is cleared as the core leaves WFE state. Therefore, after waking up from WFE state, only one WFE is needed to put a Cortex-A72 core in WFE state again. For Cortex-A53, the Event Register stays set after the restart, so the next WFE only clears it and a further WFE is needed.
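The difference can be illustrated with a small host-side model. The following C sketch does not execute WFE and is not taken from Arm documentation. The `core_model` struct, its `clear_on_wake` flag, and the helper functions are hypothetical names introduced only for illustration. It assumes that a WFE executed with the Event Register set clears the register and continues, and that the only difference between the two cores is whether leaving WFE state clears the register.

```c
#include <stdbool.h>

/* Toy model of the WFE-related state of one core (illustrative only). */
struct core_model {
    bool event_register; /* models the Event Register */
    bool clear_on_wake;  /* assumed IMPLEMENTATION DEFINED choice:
                            ClearEventRegister() when leaving WFE state */
    bool in_wfe_state;   /* true while the modeled core is suspended */
};

/* Model of executing WFE: a set Event Register is consumed and execution
 * continues; otherwise the core enters WFE state. */
static void model_wfe(struct core_model *c)
{
    if (c->event_register) {
        c->event_register = false;
        c->in_wfe_state = false;
    } else {
        c->in_wfe_state = true;
    }
}

/* Model of a wake-up event such as SEV: sets the Event Register and wakes
 * the core if it is suspended. */
static void model_wake_event(struct core_model *c)
{
    c->event_register = true;
    if (c->in_wfe_state) {
        c->in_wfe_state = false;
        if (c->clear_on_wake)
            c->event_register = false;
    }
}

/* Counts how many WFEs are needed to re-enter WFE state after the core
 * has been woken from a first WFE. */
static int wfes_needed_after_wake(bool clear_on_wake)
{
    struct core_model c = { false, clear_on_wake, false };
    int count = 0;

    model_wfe(&c);        /* first WFE: core suspends */
    model_wake_event(&c); /* wake-up event arrives */
    while (!c.in_wfe_state) {
        model_wfe(&c);
        count++;
    }
    return count;
}
```

In this model, `wfes_needed_after_wake(true)` corresponds to the Cortex-A72 style behavior and needs one WFE, while `wfes_needed_after_wake(false)` corresponds to the Cortex-A53 style behavior and needs two.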

The following waveform for Cortex-A72 shows that the ClearEventRegister() clears the Event Register when restarting from WFE state:

Figure 2 Cortex-A72 waveform

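Although different processors might handle WFE differently, software does not need to depend on either behavior. For example, the following standard Linux spinlock code re-checks the lock after every WFE, so it works correctly whether or not a given WFE actually puts the core in WFE state: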

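    #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)

    static inline void arch_spin_lock(arch_spinlock_t *lock)
    {
            unsigned int tmp;

            asm volatile(
            "       sevl\n"                  /* set the local Event Register so the first WFE falls through */
            "1:     wfe\n"                   /* wait for an event; might complete immediately */
            "2:     ldaxr   %w0, %1\n"       /* load-acquire exclusive: read the lock word */
            "       cbnz    %w0, 1b\n"       /* lock is held: wait again */
            "       stxr    %w0, %w2, %1\n"  /* try to store 1 to take the lock */
            "       cbnz    %w0, 2b\n"       /* exclusive store failed: reload and retry */
            : "=&r" (tmp), "+Q" (lock->lock)
            : "r" (1)
            : "cc", "memory");
    }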
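The same idea applies outside the kernel: always place WFE in a loop that re-checks the condition being waited for. The following is a minimal sketch using C11 atomics, not a definitive implementation. The helper names `wait_for_event_hint`, `send_event_hint`, `wait_until_set`, and `set_and_signal` are hypothetical. The sketch assumes a GCC or Clang compatible compiler, and it assumes that WFE and SEV can be executed at the current exception level; on non-AArch64 targets the helpers do nothing and the loop simply spins.

```c
#include <stdatomic.h>

/* Hypothetical helper: execute WFE on AArch64, do nothing elsewhere. */
static inline void wait_for_event_hint(void)
{
#if defined(__aarch64__)
    __asm__ volatile("wfe" ::: "memory");
#endif
}

/* Hypothetical helper: make earlier stores visible, then send an event. */
static inline void send_event_hint(void)
{
#if defined(__aarch64__)
    __asm__ volatile("dsb ish\n\tsev" ::: "memory");
#endif
}

/* Waits until *flag becomes nonzero. The flag is re-checked after every
 * WFE, so it does not matter whether a WFE suspends the core or completes
 * immediately. Returns the number of WFE hints issued. */
static unsigned wait_until_set(atomic_int *flag)
{
    unsigned waits = 0;

    while (atomic_load_explicit(flag, memory_order_acquire) == 0) {
        wait_for_event_hint();
        waits++;
    }
    return waits;
}

/* Sets the flag and signals any core waiting in wait_until_set(). */
static void set_and_signal(atomic_int *flag)
{
    atomic_store_explicit(flag, 1, memory_order_release);
    send_event_hint();
}
```

Because the loop exits only when the flag is observed as set, an extra or missing pass through WFE state changes only how long the core waits, not whether the code is correct.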
	
