The majority of the ThumbEE instruction set is identical in both encoding and behavior to the Thumb-2 instruction set, and therefore the cycle timings are also identical to the Thumb-2 instruction timings. The behavior of some instructions is different when executed in ThumbEE state instead of in Thumb state. However, the behavior changes for these instructions do not result in any changes to their cycle timing. The only additional cycle timing information for ThumbEE is for the new instructions.
Table 16.17 shows the timing operation of the new ThumbEE instructions.
 This instruction waits for all outstanding instructions to complete and then issues.
 If CHKA fails the array bounds check, then an exception is taken. Otherwise, this is a single cycle instruction.
 This instruction is predicted and behaves as a direct branch, B instruction.
 This instruction is predicted and behaves as a direct branch and link, BL instruction.
 Timing is identical to similar load instructions.
 Timing is identical to similar store instructions.
All loads and stores in ThumbEE state have the additional functionality of checking the base register for a zero value. If the base register is zero, then the processor performs a branch to the address [HandlerBase – 4]. See the ARM Architecture Reference Manual for more information.
The processor handles this scenario in the same way as an exception such as a data abort, because it does not occur in the common case. If the base register is zero, the processor flushes the pipeline and branches to the correct address. The additional cycle time penalty for this is variable in length, but is at least 13 cycles. The CHKA instruction uses the same mechanism when the array bounds check fails. This is also a rare occurrence and is therefore not optimized for performance.
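The null base register check described above can be sketched as a small behavioral model in C. This is only an illustration under stated assumptions, not a description of the hardware: the type `thumbee_check_result` and the function `thumbee_null_check` are hypothetical names, the handler base value is passed in as a plain argument, and the 13-cycle figure is used only as the lower bound quoted in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Lower bound on the extra cycles quoted in the text for a failed check.
   The real penalty is variable in length; this is not an exact cost. */
#define THUMBEE_CHECK_FAIL_MIN_PENALTY 13u

/* Hypothetical result type used only by this sketch. */
typedef struct {
    uint32_t next_pc;          /* address execution continues from */
    bool     took_handler;     /* true if the null check fired */
    uint32_t min_extra_cycles; /* lower bound on the added penalty */
} thumbee_check_result;

/* Models the null check applied to a ThumbEE load or store.
   handler_base: handler base address (assumed to be supplied by the caller)
   base:         value of the base register used by the load or store
   fallthrough:  address of the next sequential instruction */
static thumbee_check_result thumbee_null_check(uint32_t handler_base,
                                               uint32_t base,
                                               uint32_t fallthrough)
{
    thumbee_check_result r;
    if (base == 0u) {
        /* Zero base register: branch to [HandlerBase - 4] after a flush. */
        r.next_pc = handler_base - 4u;
        r.took_handler = true;
        r.min_extra_cycles = THUMBEE_CHECK_FAIL_MIN_PENALTY;
    } else {
        /* Common case: no redirect and no additional timing cost. */
        r.next_pc = fallthrough;
        r.took_handler = false;
        r.min_extra_cycles = 0u;
    }
    return r;
}
```

With a handler base of 0x8000, a zero base register redirects this model to 0x7FFC and reports at least 13 extra cycles, while any nonzero base register falls through with no added cost.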
All ThumbEE branch type instructions are predicted in ThumbEE state in the same manner that they are predicted in ARM or Thumb state. In addition, the handler base branch instructions, HB and HBL, are also predicted using the same branch prediction hardware used for direct branch and branch link, B and BL, respectively. Because the ThumbEE instruction set uses R9 as the base register rather than R13 as a stack pointer, loads and stores that use R9 as the base register and that read or write to the PC are written onto the return stack to aid in the prediction of these indirect branches. The usage model of the return stack in ThumbEE state, using R9 as the stack pointer, is identical to the usage model in ARM and Thumb state, using R13 as the stack pointer.
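The return stack usage model can be illustrated with a generic software model of a return-address predictor. This is a sketch of the general technique only, not the processor's implementation: the names `return_stack`, `rs_init`, `rs_push`, and `rs_pop` are hypothetical, and the depth `RS_DEPTH` is an assumed value chosen for illustration. The model behaves the same whether the program's stack pointer is R9 or R13, which mirrors the point made above.

```c
#include <stdbool.h>
#include <stdint.h>

#define RS_DEPTH 8u /* assumed depth, for illustration only */

/* Hypothetical circular return stack used only by this sketch. */
typedef struct {
    uint32_t entry[RS_DEPTH];
    unsigned top;   /* index of the next free slot, modulo RS_DEPTH */
    unsigned count; /* number of valid entries, saturates at RS_DEPTH */
} return_stack;

static void rs_init(return_stack *rs)
{
    rs->top = 0u;
    rs->count = 0u;
}

/* Call-like instruction: record the predicted return address.
   When the stack is full, the oldest entry is overwritten. */
static void rs_push(return_stack *rs, uint32_t return_addr)
{
    rs->entry[rs->top] = return_addr;
    rs->top = (rs->top + 1u) % RS_DEPTH;
    if (rs->count < RS_DEPTH) {
        rs->count++;
    }
}

/* Return-like instruction: pop the most recent prediction.
   Returns false when no prediction is available. */
static bool rs_pop(return_stack *rs, uint32_t *predicted)
{
    if (rs->count == 0u) {
        return false;
    }
    rs->top = (rs->top + RS_DEPTH - 1u) % RS_DEPTH;
    rs->count--;
    *predicted = rs->entry[rs->top];
    return true;
}
```

Pushing two return addresses and then popping twice yields them in reverse order, which is the last-in, first-out behavior that lets nested calls and returns be predicted.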