#pragma unroll [(n)]

This pragma instructs the compiler to unroll a loop by n iterations.


Both vectorized and nonvectorized loops can be unrolled using #pragma unroll [(n)]. That is, #pragma unroll [(n)] applies to both --vectorize and --no_vectorize.


#pragma unroll
#pragma unroll (n)



n is an optional value indicating the number of iterations to unroll.


If you do not specify a value for n, the compiler assumes #pragma unroll (4).
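As a rough illustration of what unrolling by the default factor of four means, the following hand-written sketch compares a rolled loop with a manually unrolled equivalent. The function names dot_rolled and dot_unrolled4 are hypothetical and not part of the compiler documentation, and the code the compiler actually generates may differ, for example in how it handles leftover iterations.

```c
/* Rolled form: the loop as it is written in the source. */
float dot_rolled(const float *a, const float *b, unsigned int n)
{
    unsigned int j;
    float sum = 0.0f;
    for (j = 0; j < n; j++)
        sum += a[j] * b[j];
    return sum;
}

/* Hand-unrolled by 4 (illustration only, not compiler output). */
float dot_unrolled4(const float *a, const float *b, unsigned int n)
{
    unsigned int j = 0;
    float sum = 0.0f;
    /* Main body: four iterations per pass while at least four remain. */
    for (; j + 4 <= n; j += 4)
    {
        sum += a[j]     * b[j];
        sum += a[j + 1] * b[j + 1];
        sum += a[j + 2] * b[j + 2];
        sum += a[j + 3] * b[j + 3];
    }
    /* Remainder: the 0 to 3 iterations left when n is not a multiple of 4. */
    for (; j < n; j++)
        sum += a[j] * b[j];
    return sum;
}
```

Both functions perform the same additions in the same order, so they return the same result. The unrolled form executes fewer loop-control branches at the cost of larger code.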


This pragma is only applicable if you are compiling with -O3 -Otime. When compiling with -O3 -Otime, the compiler automatically unrolls loops where it is beneficial to do so. You can use this pragma to ask the compiler to unroll a loop that has not been unrolled automatically.


Use this pragma only when you have evidence, for example from --diag_warning=optimizations, that the compiler is not unrolling loops optimally by itself.

You cannot determine whether this pragma is having any effect unless you compile with --diag_warning=optimizations or examine the generated assembly code, or both.


Even when you compile with -O3 -Otime, this pragma is only a request to the compiler to unroll a loop that has not been unrolled automatically. It does not guarantee that the loop is unrolled.

#pragma unroll [(n)] can be used only immediately before a for loop, a while loop, or a do ... while loop.
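The placement rule can be sketched as follows, with the pragma on the line immediately before each of the three loop forms. The function names and the unroll factor of 2 are hypothetical choices for illustration. As noted above, the pragma is only a request, so whether any of these loops is actually unrolled depends on the compiler and its options.

```c
/* Pragma immediately before a for loop. */
unsigned int sum_for(const unsigned int *v, unsigned int n)
{
    unsigned int i, total = 0;
    #pragma unroll (2)
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Pragma immediately before a while loop. */
unsigned int sum_while(const unsigned int *v, unsigned int n)
{
    unsigned int i = 0, total = 0;
    #pragma unroll (2)
    while (i < n)
    {
        total += v[i];
        i++;
    }
    return total;
}

/* Pragma immediately before a do ... while loop. */
unsigned int sum_do_while(const unsigned int *v, unsigned int n)
{
    unsigned int i = 0, total = 0;
    if (n == 0)
        return 0;   /* a do ... while body always runs at least once */
    #pragma unroll (2)
    do
    {
        total += v[i];
        i++;
    } while (i < n);
    return total;
}
```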


void matrix_multiply(float ** __restrict dest, float ** __restrict src1,
    float ** __restrict src2, unsigned int n)
{
    unsigned int i, j, k;
    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            /* #pragma unroll */
            for(j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}

In this example, the compiler does not normally complete its loop analysis because src2 is indexed as src2[j][k] but the loops are nested in the opposite order, that is, with j inside k. When #pragma unroll is uncommented in the example, the compiler proceeds to unroll the loop four times.

If the intention is to multiply matrices whose size is not a multiple of four, for example an n * n matrix where n is not divisible by four, #pragma unroll (m) might be used instead, where m is a value chosen so that n is an integral multiple of m.
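As a minimal sketch of that suggestion, assume 6 * 6 matrices. Six is not a multiple of four but is a multiple of three, so m = 3 is one possible choice. The function name matrix_multiply_unroll3 and the size are hypothetical, and the pragma remains a request that the compiler can decline.

```c
/* Variant of the example above with an explicit unroll factor.
   Assumes the caller passes n = 6 (or another multiple of 3). */
void matrix_multiply_unroll3(float ** __restrict dest, float ** __restrict src1,
    float ** __restrict src2, unsigned int n)
{
    unsigned int i, j, k;
    for (i = 0; i < n; i++)
    {
        for (k = 0; k < n; k++)
        {
            float sum = 0.0f;
            /* Request unrolling by 3 so that n = 6 divides evenly. */
            #pragma unroll (3)
            for (j = 0; j < n; j++)
                sum += src1[i][j] * src2[j][k];
            dest[i][k] = sum;
        }
    }
}
```

The function computes the same product whatever the value of n, because unrolling changes only how the loop is executed and not what it computes.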