The compiler can provide diagnostic information to indicate
where vectorization is applied successfully and where
it fails. The command-line options that
provide this information are
Example 16 shows two functions that implement a simple sum operation on an array. This code does not vectorize.
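As an illustration of the kind of code described, the following is a minimal sketch of two such functions. The names `add_pair` and `add_to_array` are hypothetical and are not taken from the guide. A compiler that does not inline the call typically cannot vectorize the loop, because the loop body contains a call to an out-of-line function.

```c
/* Hypothetical helper: adds two integers. Because it is an ordinary
   out-of-line function, every loop iteration below contains a call. */
int add_pair(int a, int b)
{
    return a + b;
}

/* Hypothetical array sum: writes pb[i] + x into pa[i] for n elements.
   The function call in the loop body is what usually blocks vectorization. */
void add_to_array(int *pa, int *pb, unsigned int n, int x)
{
    unsigned int i;

    for (i = 0; i < n; i++) {
        pa[i] = add_pair(pb[i], x);
    }
}
```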
produces an optimization warning message for the
Adding the __inline qualifier to the definition of
this code to vectorize but it is still not optimal. Using the
again produces optimization warning messages to indicate that the
loop vectorizes but there might be a pointer aliasing problem.
Because the compiler cannot prove that the pointers refer to non-overlapping memory, it must generate a runtime test for aliasing and output both vectorized and scalar copies of the code. Example 17 shows how this can be improved using the restrict keyword if you know that the pointers are not aliased.
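The effect of the restrict keyword can be sketched as follows. This is an illustration under stated assumptions, not the guide's own example: it uses the C99 spelling `restrict` and `static inline`, and the names `add_one` and `add_scalar` are hypothetical. Some toolchains use a different spelling for the qualifier, so check your compiler documentation.

```c
#include <stddef.h>

/* Hypothetical helper, declared inline so that the loop body
   contains no out-of-line function call. */
static inline int add_one(int a, int b)
{
    return a + b;
}

/* restrict is a promise from the caller that dst and src never overlap.
   With that promise the compiler does not need a runtime aliasing test
   or a separate scalar copy of the loop for the overlapping case.
   Behavior is undefined if the arrays do overlap. */
void add_scalar(int *restrict dst, const int *restrict src, size_t n, int x)
{
    for (size_t i = 0; i < n; i++) {
        dst[i] = add_one(src[i], x);
    }
}
```

Only add the qualifier when every caller can guarantee that the two arrays are distinct, because the compiler does not check the promise.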
The final improvement concerns the number of loop iterations. In Example 17, the iteration count is not fixed and might not be a multiple of the number of elements that fit in a NEON register. The compiler must therefore test for remaining iterations and execute them using non-vectorized code. If you know that your iteration count is always such a multiple, you can indicate this to the compiler. Example 18 shows this final improvement, which obtains the best performance from vectorization.
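One common way to express a known iteration multiple is to mask the low bits of the loop bound. The sketch below assumes 32-bit `int` elements, so that four of them fill a 128-bit NEON register, and uses the hypothetical name `add_scalar_x4`. It is an illustration of the idea rather than a definitive implementation.

```c
/* Assumes the caller always passes a count that is a multiple of 4.
   The mask (n & ~3u) clears the two low bits of n, so the compiler can
   see that the trip count is a multiple of 4 and can omit the scalar
   loop for leftover elements. */
void add_scalar_x4(int *restrict dst, const int *restrict src,
                   unsigned int n, int x)
{
    for (unsigned int i = 0; i < (n & ~3u); i++) {
        dst[i] = src[i] + x;
    }
}
```

If `n` is not a multiple of 4, the mask rounds it down and the trailing one to three elements are left unprocessed, so use this form only when the multiple is guaranteed.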