DO loop unrolling is a standard "manual" optimization that enlarges the body of a small loop by replicating its original statements. KAP does this automatically to speed up some scalar loops. Unrolling a loop involves duplicating the loop body one or more times within the loop, adding an increment (or changing the increment already on the DO statement), and possibly inserting code before or after the loop to execute the excess iterations (the cleanup code).
If the loop bounds are constant and the iteration count of the loop is small, the loop may be entirely deleted and replaced by copies of the loop body. If the loop bounds are constant and the iteration count is a multiple of the unrolling factor, the cleanup code can be omitted. If the loop bounds are constant and the iteration count is evenly divisible by a number close to the unrolling factor given by the /unroll qualifier, KAP may use that number as the unrolling factor to eliminate the cleanup loop. Only numbers within 25% of the given unrolling factor are considered.
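The divisor rule above can be sketched as a small search. This is an illustration only, not KAP's actual algorithm: the name pick_factor is hypothetical, the 25% window is assumed to be inclusive, and the tie-breaking rule (prefer the candidate closest to the requested factor, then the larger one) is an assumption. The sketch is written in free-form Fortran for brevity.

```fortran
module unroll_factor_sketch
  implicit none
contains
  ! Hypothetical helper (not part of KAP): returns a divisor of trip_count
  ! that lies within 25% of the requested unrolling factor, or 0 if no
  ! such divisor exists.
  ! Assumptions: the window is inclusive; the candidate closest to the
  ! requested factor wins, and ties go to the larger candidate.
  integer function pick_factor(trip_count, requested) result(best)
    integer, intent(in) :: trip_count, requested
    integer :: f

    best = 0
    do f = 2, trip_count
      ! Integer form of 0.75*requested <= f <= 1.25*requested
      if (4*f < 3*requested) cycle
      if (4*f > 5*requested) exit
      ! Only exact divisors remove the need for a cleanup loop
      if (mod(trip_count, f) /= 0) cycle
      if (best == 0) then
        best = f
      else if (abs(f - requested) <= abs(best - requested)) then
        best = f
      end if
    end do
  end function pick_factor
end module unroll_factor_sketch

program pick_factor_demo
  use unroll_factor_sketch
  implicit none
  print *, pick_factor(64, 8)   ! 8: the requested factor already divides 64
  print *, pick_factor(63, 8)   ! 9: both 7 and 9 divide 63; the tie goes to 9
  print *, pick_factor(61, 8)   ! 0: no divisor of 61 lies between 6 and 10
end program pick_factor_demo
```

A result of 0 corresponds to the case where the requested factor is kept and a cleanup loop is still needed.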
Inner-loop unrolling is controlled by the /unroll and /unroll2 command qualifiers and the corresponding directive. Outer-loop unrolling is part of memory management and is controlled by /roundoff and /scalaropt.
The following examples can be reproduced with either /unroll=8 and /unroll2=1000 on the command line or the directive C*$* unroll (8,1000).
If the loop bounds are variable, a loop might be unrolled as follows:
      DO 10 I = 1,N
        A(I) = B(I)/A(I-1)
   10 CONTINUE
Becomes:
      DO 24 I=1,N-7,8
        A(I) = B(I) / A(I-1)
        A(I+1) = B(I+1) / A(I)
        A(I+2) = B(I+2) / A(I+1)
        A(I+3) = B(I+3) / A(I+2)
        A(I+4) = B(I+4) / A(I+3)
        A(I+5) = B(I+5) / A(I+4)
        A(I+6) = B(I+6) / A(I+5)
        A(I+7) = B(I+7) / A(I+6)
   24 CONTINUE
      DO 2 I=I,N,1
        A(I) = B(I) / A(I-1)
    2 CONTINUE
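As a check on the transformation, the following sketch compares the original loop with the version unrolled by 8 plus a cleanup loop, for every N from 0 to 40. It is not KAP output: the names rolled, unrolled, and istart, the array sizes, and the test data are all assumptions chosen for illustration, and free-form Fortran is used for brevity. The variable istart stands in for the I=I start of the cleanup loop; it holds the value left in I when the unrolled loop ends, which is the first index not yet processed.

```fortran
module unroll_sketch
  implicit none
contains
  ! Original loop: each A(I) depends on A(I-1).
  subroutine rolled(a, b, n)
    integer, intent(in) :: n
    real, intent(inout) :: a(0:n)
    real, intent(in)    :: b(n)
    integer :: i
    do i = 1, n
      a(i) = b(i) / a(i-1)
    end do
  end subroutine rolled

  ! Same computation, unrolled by 8 with a cleanup loop.
  subroutine unrolled(a, b, n)
    integer, intent(in) :: n
    real, intent(inout) :: a(0:n)
    real, intent(in)    :: b(n)
    integer :: i, istart
    do i = 1, n-7, 8
      a(i)   = b(i)   / a(i-1)
      a(i+1) = b(i+1) / a(i)
      a(i+2) = b(i+2) / a(i+1)
      a(i+3) = b(i+3) / a(i+2)
      a(i+4) = b(i+4) / a(i+3)
      a(i+5) = b(i+5) / a(i+4)
      a(i+6) = b(i+6) / a(i+5)
      a(i+7) = b(i+7) / a(i+6)
    end do
    ! i now holds the first index the unrolled loop did not process
    istart = i
    do i = istart, n
      a(i) = b(i) / a(i-1)
    end do
  end subroutine unrolled
end module unroll_sketch

program unroll_check
  use unroll_sketch
  implicit none
  integer, parameter :: nmax = 40      ! assumed test size
  real :: a_ref(0:nmax), a_unr(0:nmax), b(nmax)
  integer :: n, i

  b = [(real(i), i = 1, nmax)]         ! arbitrary nonzero test data
  do n = 0, nmax
    a_ref = 0.0
    a_unr = 0.0
    a_ref(0) = 1.0                     ! nonzero seed avoids division by zero
    a_unr(0) = 1.0
    call rolled(a_ref, b, n)
    call unrolled(a_unr, b, n)
    if (maxval(abs(a_ref - a_unr)) > 1.0e-5) then
      print *, 'mismatch for n =', n
      error stop 1
    end if
  end do
  print *, 'rolled and unrolled loops agree for all n tested'
end program unroll_check
```

Values of N below 8 exercise the case where the unrolled loop executes zero times and the cleanup loop does all the work.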
If the loop bounds are constant, the unrolled loop might look like the following example. Here the unrolling factor has been changed from 8 to 9, which divides the iteration count of 63 evenly and is within 25% of the requested factor, so no cleanup code is needed:
      DO 20 I=1,63
        A(I) = B(I)/A(I-1)
   20 CONTINUE
Becomes:
      DO 25 I=1,55,9
        A(I) = B(I) / A(I-1)
        A(I+1) = B(I+1) / A(I)
        A(I+2) = B(I+2) / A(I+1)
        A(I+3) = B(I+3) / A(I+2)
        A(I+4) = B(I+4) / A(I+3)
        A(I+5) = B(I+5) / A(I+4)
        A(I+6) = B(I+6) / A(I+5)
        A(I+7) = B(I+7) / A(I+6)
        A(I+8) = B(I+8) / A(I+7)
   25 CONTINUE
If the loop iteration count is constant and small, the loop may be removed altogether. For example:
      DO 30 I=1,5
        A(I) = B(I)/A(I-1)
   30 CONTINUE
Becomes:
      A(1) = B(1) / A(0)
      A(2) = B(2) / A(1)
      A(3) = B(3) / A(2)
      A(4) = B(4) / A(3)
      A(5) = B(5) / A(4)