16.1. About instruction cycle timing

This chapter provides the information needed to estimate how much execution time particular code sequences require. The complexity of the processor makes it impossible to guarantee precise timing information with hand calculations. The timing of an instruction is often affected by other concurrent instructions, memory system activity, and additional events outside the instruction flow. Describing all possible instruction interactions and all possible events taking place in the processor is beyond the scope of this document. Only a cycle-accurate model of the processor can produce precise timings for a particular instruction sequence.

This chapter provides a framework for doing basic timing estimations for instruction sequences. The framework requires three main information components:

Instruction-specific scheduling information

This includes the number of micro-operations for each main instruction and the source and destination requirements for each micro-operation. The processor can issue a series of micro-operations to the execution pipeline for each ARM instruction executed. Most ARM instructions execute only one micro-operation. More complex ARM instructions such as load multiples can consist of several micro-operations.

Dual issue restriction criteria

This is the set of rules that governs which instruction types can dual issue and under what conditions. This information is provided for dual issue of ARM instructions and Advanced SIMD instructions.

Other pipeline-dependent latencies

In addition to the time taken for the scheduling and issuing of instructions, there are other sources of latency that affect the execution time of a program sequence. The two most common examples are a branch mispredict and a memory system stall such as a data cache miss on a load instruction. These cases are the most difficult to predict and often must be ignored or estimated using statistical analysis techniques. Fortunately, you can ignore most of these additional latencies when creating an optimal hand schedule for a code sequence. Hand scheduling is the most useful application of this cycle timing information.
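
The following Python sketch illustrates how the three information components might be combined into a rough estimate. It is not a model of the processor. The `Instr` type, the `can_pair` flag, the micro-operation counts, and all rates and penalties are hypothetical values chosen for illustration, not figures from the scheduling tables. The sketch assumes that one micro-operation or one dual-issued pair issues per cycle, and it ignores the source and destination requirements of each micro-operation.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    micro_ops: int   # hypothetical count, not taken from the scheduling tables
    can_pair: bool   # hypothetical stand-in for the dual issue restriction criteria


def issue_cycles(seq):
    """Estimate issue cycles: one micro-operation or one dual-issued pair per cycle."""
    cycles = 0
    i = 0
    while i < len(seq):
        cur = seq[i]
        nxt = seq[i + 1] if i + 1 < len(seq) else None
        # Simplified pairing rule (assumption): two adjacent single
        # micro-operation instructions that both allow pairing share a cycle.
        pairable = (
            nxt is not None
            and cur.micro_ops == 1 and nxt.micro_ops == 1
            and cur.can_pair and nxt.can_pair
        )
        if pairable:
            cycles += 1
            i += 2
        else:
            # Otherwise the instruction issues alone, one micro-operation per cycle.
            cycles += cur.micro_ops
            i += 1
    return cycles


def expected_stall_cycles(events, rate, penalty):
    """Statistical estimate: number of events x probability x cycles lost per event."""
    return events * rate * penalty


def estimate_cycles(seq, branches=0, mispredict_rate=0.0, mispredict_penalty=0,
                    loads=0, miss_rate=0.0, miss_penalty=0):
    """Issue estimate plus expected branch mispredict and cache miss stalls."""
    return (issue_cycles(seq)
            + expected_stall_cycles(branches, mispredict_rate, mispredict_penalty)
            + expected_stall_cycles(loads, miss_rate, miss_penalty))


# Hypothetical sequence: two pairable instructions, a three micro-operation
# load multiple, and a trailing single micro-operation instruction.
example = [
    Instr("ADD", 1, True),
    Instr("SUB", 1, True),
    Instr("LDM", 3, False),
    Instr("MOV", 1, True),
]
```

When comparing alternative hand schedules of the same code, the stall terms can be left at zero so that only `issue_cycles` differs between candidates, which matches the observation that most of the additional latencies can be ignored for hand scheduling.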