You copied the Doc URL to your clipboard.

Optimization of loop termination in C code

Loops are a common construct in most programs. Because a significant amount of execution time is often spent in loops, it is worthwhile paying attention to time-critical loops.

The loop termination condition can cause significant overhead if written without caution. Where possible:

  • use simple termination conditions

  • write count-down-to-zero loops

  • use counters of type unsigned int

  • test for equality against zero.

Following any or all of these guidelines, separately or in combination, is likely to result in better code.

Table 8 shows two sample implementations of a routine to calculate n! that together illustrate loop termination overhead. The first implementation calculates n! using an incrementing loop, while the second routine calculates n! using a decrementing loop.

Table 8. C code for incrementing and decrementing loops
Incrementing loopDecrementing loop
int fact1(int n)
{
    int i, fact = 1;
    for (i = 1; i <= n; i++)
        fact *= i;
    return (fact);
}
int fact2(int n)
{
    unsigned int i, fact = 1;
    for (i = n; i != 0; i--)
        fact *= i;
    return (fact);
}

Table 9 shows the corresponding disassembly of the machine code produced by the compiler for each of the sample implementations of Table 8, where the C code for both implementations has been compiled using the options -O2 -Otime.

Table 9. C Disassembly for incrementing and decrementing loops
Incrementing loopDecrementing loop
fact1 PROC
    MOV      r2, r0
    MOV      r0, #1
    CMP      r2, #1
    MOV      r1, r0
    BXLT     lr
|L1.20|
    MUL      r0, r1, r0
    ADD      r1, r1, #1
    CMP      r1, r2
    BLE      |L1.20|
    BX       lr
    ENDP
fact2 PROC
    MOVS     r1, r0
    MOV      r0, #1
    BXEQ     lr
|L1.12|
    MUL      r0, r1, r0
    SUBS     r1, r1, #1
    BNE      |L1.12|
    BX       lr
ENDP

Comparing the disassemblies of Table 9 shows that the ADD and CMP instruction pair in the incrementing loop disassembly has been replaced with a single SUBS instruction in the decrementing loop disassembly. This is because a compare with zero can be used instead.

In addition to saving an instruction in the loop, the variable n does not have to be saved across the loop, so the use of a register is also saved in the decrementing loop disassembly. This eases register allocation. It is even more important if the original termination condition involves a function call. For example:

for (...; i < get_limit(); ...);

The technique of initializing the loop counter to the number of iterations required, and then decrementing down to zero, also applies to while and do statements.

See also