Advanced SIMD instructions flow through the ARM pipeline and then enter the NEON instruction queue between the ARM and NEON pipelines. Although an instruction in the NEON instruction queue is completed from the point of view of the ARM pipeline, the NEON unit must still decode and schedule the instruction. The NEON instruction queue is 16 entries deep. There is also a 12-entry NEON data queue that holds entries for Advanced SIMD load instructions.
As long as these queues are not full, the processor can continue to run and execute both ARM and Advanced SIMD instructions. When the Advanced SIMD instruction or data queue is full, the processor stalls execution of the next Advanced SIMD instruction until there is room for this instruction in the queues. In this manner, the cycle timing of Advanced SIMD instructions scheduled in the NEON engine can affect the overall timing of the instruction sequence, but only if there are enough Advanced SIMD instructions to fill the instruction or data queue.
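The back-pressure described above can be illustrated with a toy model. The sketch below is not cycle-accurate and does not reflect the real NEON scheduler: it assumes the ARM pipeline tries to enqueue one Advanced SIMD instruction per cycle and that the NEON unit drains one queued instruction every `drain_interval` cycles. The function name `count_stall_cycles` and the `drain_interval` parameter are hypothetical, and the 12-entry data queue is not modeled. Only the 16-entry instruction queue depth comes from the text.

```python
NEON_INSTR_QUEUE_DEPTH = 16  # instruction queue depth stated in the text


def count_stall_cycles(num_simd, drain_interval,
                       queue_depth=NEON_INSTR_QUEUE_DEPTH):
    """Count cycles the ARM side stalls while issuing num_simd
    back-to-back Advanced SIMD instructions (toy model, see assumptions).

    drain_interval is a hypothetical stand-in for how slowly the NEON
    unit consumes queued instructions: one instruction leaves the queue
    on every cycle that is a multiple of drain_interval.
    """
    queued = 0   # entries currently held in the instruction queue
    issued = 0   # instructions the ARM pipeline has handed over so far
    stalls = 0   # cycles in which the ARM side could not enqueue
    cycle = 0

    while issued < num_simd:
        cycle += 1

        # NEON side: remove one entry from the queue on drain cycles.
        if queued > 0 and cycle % drain_interval == 0:
            queued -= 1

        # ARM side: enqueue if there is room, otherwise stall this cycle.
        if queued < queue_depth:
            queued += 1
            issued += 1
        else:
            stalls += 1

    return stalls
```

In this model, a burst of up to 16 instructions never stalls regardless of how slowly the NEON side drains, because the queue absorbs the whole burst. A 17th instruction stalls only when the drain rate is too slow to have freed an entry by then, which matches the point that NEON cycle timing affects overall timing only when enough Advanced SIMD instructions accumulate to fill a queue.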
When the processor is configured without NEON, all attempted Advanced SIMD and VFP instructions result in an Undefined Instruction exception.