8.4 Loop Unrolling

DO loop unrolling is a standard "manual" optimization that adds statements to a small loop by replicating the original loop body. KAP performs it automatically to speed up some scalar loops. Unrolling a loop involves duplicating the loop body one or more times within the loop, adding an increment (or changing the increment already on the DO statement), and possibly inserting code before or after the loop to execute the leftover iterations (the cleanup code).

If the loop bounds are constant and the iteration count of the loop is small, the loop may be entirely deleted and replaced by copies of the loop body. If the loop bounds are constant and the iteration count is a multiple of the unrolling factor, the cleanup code can be omitted. If the loop bounds are constant and the iteration count is evenly divisible by a number close to the unrolling factor given by the /unroll qualifier, KAP may use that number as the unrolling factor to eliminate the cleanup loop. Only numbers within 25% of the given unrolling factor will be considered.
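As a rough illustration of the divisibility rule above, the following sketch lists the unrolling factors that lie within 25% of a requested factor and divide a constant iteration count evenly. The program name FACTRS, the variable names, and the integer rounding of the 25% window are assumptions made for this example; they are not taken from KAP.

```fortran
      PROGRAM FACTRS
!     Hypothetical helper, not KAP code: list unrolling factors that
!     lie within 25% of the requested factor and divide the constant
!     iteration count evenly, so no cleanup loop would be needed.
      INTEGER NITER, NREQ, LO, HI, K
      PARAMETER (NITER = 63, NREQ = 8)
!     Assumed rounding: 25% of 8 is 2, so the window is 6 to 10.
      LO = NREQ - NREQ/4
      HI = NREQ + NREQ/4
      DO 10 K = LO, HI
         IF (MOD(NITER, K) .EQ. 0) PRINT *, 'candidate factor:', K
   10 CONTINUE
      END
```

For 63 iterations and a requested factor of 8, the candidates are 7 and 9. How KAP chooses among several candidates is not stated here.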

Inner-loop unrolling is controlled by the /unroll and /unroll2 command qualifiers and by the unroll directive. Outer-loop unrolling is part of memory management and is controlled by /roundoff and /scalaropt.

The following examples can be reproduced with either /unroll=8 and /unroll2=1000 on the command line or the directive C*$* unroll (8,1000).

If the loop bounds are variable, a loop might be unrolled as follows:

      DO 10 I = 1,N
         A(I) = B(I)/A(I-1)
   10 CONTINUE

Becomes:

      DO 24 I=1,N-7,8
         A(I) = B(I) / A(I-1)
         A(I+1) = B(I+1) / A(I)
         A(I+2) = B(I+2) / A(I+1)
         A(I+3) = B(I+3) / A(I+2)
         A(I+4) = B(I+4) / A(I+3)
         A(I+5) = B(I+5) / A(I+4)
         A(I+6) = B(I+6) / A(I+5)
         A(I+7) = B(I+7) / A(I+6)
   24 CONTINUE
      DO 2 I=I,N,1
         A(I) = B(I) / A(I-1)
    2 CONTINUE
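The equivalence of the original loop and the unrolled loop with cleanup can be checked with a small test program. This is a sketch written for illustration, not KAP output: the program name UNRCHK, the second array C, the test data, and the shorter unrolling factor of 4 are all assumptions chosen to keep the example brief.

```fortran
      PROGRAM UNRCHK
!     Illustrative check, not KAP output: run the original loop and
!     an unrolled loop with cleanup on the same data and compare.
!     An unrolling factor of 4 keeps the sketch short.
      INTEGER NMAX
      PARAMETER (NMAX = 30)
      REAL A(0:NMAX), C(0:NMAX), B(NMAX)
      INTEGER I, N, NBAD
      N = 27
      NBAD = 0
      DO 5 I = 1, N
         B(I) = REAL(I) + 0.5
    5 CONTINUE
      A(0) = 1.0
      C(0) = 1.0
!     Original loop.
      DO 10 I = 1, N
         A(I) = B(I) / A(I-1)
   10 CONTINUE
!     Unrolled by 4; on exit I is the first index not yet processed.
      DO 20 I = 1, N-3, 4
         C(I)   = B(I)   / C(I-1)
         C(I+1) = B(I+1) / C(I)
         C(I+2) = B(I+2) / C(I+1)
         C(I+3) = B(I+3) / C(I+2)
   20 CONTINUE
!     Cleanup loop for the remaining MOD(N,4) iterations.
      DO 30 I = I, N
         C(I) = B(I) / C(I-1)
   30 CONTINUE
!     Count elements where the two versions disagree.
      DO 40 I = 1, N
         IF (A(I) .NE. C(I)) NBAD = NBAD + 1
   40 CONTINUE
      PRINT *, 'mismatches:', NBAD
      END
```

Because both versions perform the same divisions in the same order, the program is expected to report no mismatches. The cleanup loop depends on the DO variable keeping its value after the unrolled loop ends, which is why it can start at I.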

If loop bounds are constant, the unrolled loop might look like the following example. The unrolling factor has been modified for this loop to avoid inserting cleanup code:

      DO 20 I=1,63
         A(I) = B(I)/A(I-1)
   20 CONTINUE

Becomes:

      DO 25 I=1,55,9
         A(I) = B(I) / A(I-1)
         A(I+1) = B(I+1) / A(I)
         A(I+2) = B(I+2) / A(I+1)
         A(I+3) = B(I+3) / A(I+2)
         A(I+4) = B(I+4) / A(I+3)
         A(I+5) = B(I+5) / A(I+4)
         A(I+6) = B(I+6) / A(I+5)
         A(I+7) = B(I+7) / A(I+6)
         A(I+8) = B(I+8) / A(I+7)
   25 CONTINUE
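To see why no cleanup loop is needed here, the trip count of the unrolled DO statement can be worked out with the standard Fortran formula MAX((M2 - M1 + M3)/M3, 0). The short program below is only an arithmetic sketch; the program name TRIPCT and its variable names are assumptions made for this example.

```fortran
      PROGRAM TRIPCT
!     Illustrative arithmetic, not KAP output: trip count of
!     DO 25 I=1,55,9 using the standard Fortran formula
!     MAX((M2 - M1 + M3)/M3, 0).
      INTEGER M1, M2, M3, NTRIP
      PARAMETER (M1 = 1, M2 = 55, M3 = 9)
      NTRIP = MAX((M2 - M1 + M3)/M3, 0)
      PRINT *, 'trips:', NTRIP, ' statements executed:', NTRIP*M3
      END
```

The loop makes 7 trips of 9 statements each, which covers all 63 iterations of the original loop.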

Or, if the loop iteration count is constant and small, the loop may be removed altogether, for example:

      DO 30 I=1,5
         A(I) = B(I)/A(I-1)
   30 CONTINUE

Becomes:

      A(1) = B(1) / A(0)
      A(2) = B(2) / A(1)
      A(3) = B(3) / A(2)
      A(4) = B(4) / A(3)
      A(5) = B(5) / A(4)

