
# Optimization of loop termination in C code

Loops are a common construct in most programs. Because a significant amount of execution time is often spent in loops, it is worthwhile paying attention to time-critical loops.

The loop termination condition can cause significant overhead if written without caution. Where possible:

• Use simple termination conditions.

• Write count-down-to-zero loops.

• Use counters of type `unsigned int`.

• Test for equality against zero.

Following any or all of these guidelines, separately or in combination, is likely to result in smaller, faster code.

The following table shows two sample implementations of a routine to calculate `n!` that together illustrate loop termination overhead. The first implementation calculates `n!` using an incrementing loop, while the second routine calculates `n!` using a decrementing loop.

Table 7-1 C code for incrementing and decrementing loops

Incrementing loop:

```
int fact1(int n)
{
    int i, fact = 1;
    for (i = 1; i <= n; i++)
        fact *= i;
    return (fact);
}
```

Decrementing loop:

```
int fact2(int n)
{
    unsigned int i, fact = 1;
    for (i = n; i != 0; i--)
        fact *= i;
    return (fact);
}
```
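The two routines are interchangeable only for non-negative `n`: in `fact2`, a negative argument is converted to a very large `unsigned int` count, whereas `fact1` simply skips its loop. The following is a minimal sketch, not part of the original sample, that checks the two routines agree over the range where `n!` fits in a 32-bit `int`. The helper name `factorials_agree` and the bound of 12 are assumptions made for illustration.

```c
/* The two routines from the table above, repeated so the sketch is self-contained. */
int fact1(int n)
{
    int i, fact = 1;
    for (i = 1; i <= n; i++)
        fact *= i;
    return (fact);
}

int fact2(int n)
{
    unsigned int i, fact = 1;
    for (i = n; i != 0; i--)
        fact *= i;
    return (fact);
}

/* Hypothetical helper: returns 1 if both routines agree for every n in 0..12.
   12! is the largest factorial that fits in a 32-bit int, so larger n is not tested. */
int factorials_agree(void)
{
    int n;
    for (n = 0; n <= 12; n++) {
        if (fact1(n) != fact2(n))
            return 0;
    }
    return 1;
}
```

Negative values of `n` are deliberately left out of the check, because `fact2` would then iterate roughly four billion times.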

The following table shows the corresponding disassembly of the machine code produced by `armclang -Os -S --target=arm-arm-none-eabi -march=armv8-a` for each of the sample implementations above.

Table 7-2 C disassembly for incrementing and decrementing loops

Incrementing loop:

```
fact1:
        mov     r1, r0
        mov     r0, #1
        cmp     r1, #1
        bxlt    lr
        mov     r2, #0
.LBB0_1:
        add     r2, r2, #1
        mul     r0, r0, r2
        cmp     r1, r2
        bne     .LBB0_1
        bx      lr
```

Decrementing loop:

```
fact2:
        mov     r1, r0
        mov     r0, #1
        cmp     r1, #0
        bxeq    lr
.LBB1_1:
        mul     r0, r0, r1
        subs    r1, r1, #1
        bne     .LBB1_1
        bx      lr
```

Comparing the disassemblies shows that the `ADD` and `CMP` instruction pair in the incrementing loop disassembly has been replaced with a single `SUBS` instruction in the decrementing loop disassembly. Because the `SUBS` instruction updates the status flags, including the Z flag, there is no requirement for an explicit `CMP r1,r2` instruction.

In addition to saving an instruction in the loop, the variable `n` does not have to be available for the lifetime of the loop, reducing the number of registers that have to be maintained. This eases register allocation. The saving is greater still if the original termination condition involves a function call, because the call is made each time the condition is tested. For example:

```
for (...; i < get_limit(); ...);
```
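As a minimal sketch of the difference, the following compares a loop that calls `get_limit()` in its termination condition with a count-down loop that calls it once in the initializer. The body of `get_limit()`, the `limit_calls` counter, and both loop function names are hypothetical, added only to make the number of calls observable. The rewrite assumes `get_limit()` returns the same value throughout the loop and has no side effects the program depends on.

```c
/* Hypothetical call counter, used only to observe how often get_limit() runs. */
unsigned int limit_calls = 0;

/* Hypothetical stand-in for a limit function; assumed to return a constant. */
unsigned int get_limit(void)
{
    limit_calls++;
    return 8u;
}

/* get_limit() is evaluated every time the termination condition is tested. */
unsigned int count_calling_each_time(void)
{
    unsigned int i, iterations = 0;
    for (i = 0; i < get_limit(); i++)
        iterations++;
    return iterations;
}

/* get_limit() is evaluated once; the loop then counts down to zero. */
unsigned int count_calling_once(void)
{
    unsigned int i, iterations = 0;
    for (i = get_limit(); i != 0; i--)
        iterations++;
    return iterations;
}
```

Both functions run the loop body eight times, but the first evaluates `get_limit()` nine times (once per test of the condition) and the second evaluates it once.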

The technique of initializing the loop counter to the number of iterations required, and then decrementing down to zero, also applies to `while` and `do` statements.
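The same count-down pattern might look as follows with `while` and `do` statements. This is a sketch rather than code from the original sample: the function names `fact_while` and `fact_do` are hypothetical, and the `do` version assumes the caller guarantees `n >= 1`, because its body always runs at least once.

```c
/* while form: the counter is tested before each pass, so n == 0 runs the body zero times. */
unsigned int fact_while(unsigned int n)
{
    unsigned int fact = 1;
    while (n != 0) {
        fact *= n;
        n--;
    }
    return fact;
}

/* do form: the body runs before the first test, so this assumes n >= 1.
   Passing 0 would wrap the counter around and loop for a very long time. */
unsigned int fact_do(unsigned int n)
{
    unsigned int fact = 1;
    do {
        fact *= n;
    } while (--n != 0);
    return fact;
}
```

In both forms the counter is an `unsigned int` that is decremented and compared for equality against zero, matching the guidelines at the start of this topic.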