To maximize branch prediction performance, avoid certain code constructs. For example:
Using conditional Undefined instructions in normal code to enter the undefined handler as a means of doing emulation.
Coding more than two likely taken branches per fetch. This can only happen in Thumb state. Unless used as a jump table where each branch is its own basic block, use NOPs for padding.
Coding more than three branches per fetch that are likely to be executed in sequence.
In Thumb state, it is possible to pack four branches in a single fetch, for example, in a multiway branch:
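A minimal sketch of such a multiway branch, assuming a fetch holds four 16-bit Thumb instructions and the first branch is aligned to the start of a fetch. The label names are hypothetical and would be defined elsewhere in the code:

```asm
        ; Four 16-bit Thumb branches packed into a single fetch
        ; (labels are illustrative assumptions, not from a real program)
        BVS     overflow        ; conditional branch 1
        BGT     greater_than    ; conditional branch 2
        BLT     less_than       ; conditional branch 3
        B       equal           ; unconditional branch, reached if none above is taken
```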
This sequence has more than three branches: three conditional branches followed by a fourth branch that is likely to be reached. Avoid this kind of sequence, or use NOPs to break up the branch sequence.
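One possible way to break up a four-way branch with NOPs is sketched below. It assumes that a fetch holds four 16-bit Thumb instructions and that the first branch is aligned to the start of a fetch, and the label names are hypothetical:

```asm
        ; First fetch: two branches plus two NOPs of padding
        BVS     overflow        ; conditional branch 1
        BGT     greater_than    ; conditional branch 2
        NOP                     ; padding
        NOP                     ; padding, pushes the remaining branches into the next fetch
        ; Second fetch: the remaining two branches
        BLT     less_than       ; conditional branch 3
        B       equal           ; unconditional branch, reached if none above is taken
```

With this layout, each fetch holds at most two branches. The cost is two extra instructions that execute whenever the first two branches are not taken, so the padding is probably only worthwhile on paths where prediction accuracy matters.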