2.9.2 Optimizing Large Programs with KAP

Follow these guidelines to optimize large programs:

1. Compile the program without KAP, with minimum compiler optimization, and with all compiler run-time checks enabled. Note the execution time and verify the results. If the program fails at this step, there is not much optimization you can do.
   - Some older programs use standard-violating techniques that KAP will not transform safely. If KAP fails because of this problem, there is little optimization you can do.
   - If you have the time and you know what the program is supposed to do, you can try to isolate the incorrect code, correct it, and proceed. For large programs, this is feasible only if the problems are easily understood and isolated, or if you have enough time to track down the more intractable ones.
   - If the problem code is isolated and runs without KAP optimization, you may be able to run KAP on the rest of the program and leave out the problematic sections. You can also refer to Section 2.13 on KAP problems. You may be able to diagnose and correct some problems, and then run KAP on your program successfully.
2. Compile without KAP but with maximum compiler optimization. Note the execution time and verify the results. If the program fails, reduce compiler optimization and try again.
3. Recompile the fastest correct non-KAP build with profiling enabled (for example, gprof) and run it again to identify the program units that take the most time to run.
   - If some time-intensive units have many iterative loops and arrays, then those units are good candidates for KAP loop optimizations. Go to step 4.
   - If these units are not good candidates, then the lower-payoff optimizations, such as inlining, may provide some performance improvement, especially where inlining inside loop nests also allows KAP to perform vectorization optimizations. In this case, go to step 6.
4. If time-intensive routines were identified as good candidates, run KAP on them with modest KAP optimization (/optimize=2), compile the whole program with the other qualifiers used in the best run from step 2, note the execution time, and verify the results.
   - If the program fails, try again with the KAP qualifier /roundoff=0. If that works, the failure is probably due to a roundoff-sensitive operation. If it still fails with /roundoff=0, try /scalaropt=1.
5. If step 4 works, repeat with full KAP optimization, with full compiler optimization, and with /roundoff=0 or /scalaropt=1, if needed.
   - If the program fails, reduce the setting to a lower KAP optimization level or a lower compiler optimization level, and try again. If you have success at this step, you can also try the suggestions found in Section 2.12.
6. If there are no routines with arrays and loops, run KAP on the whole program with /optimize=0 and /inline_and_copy=aaa,bbb,ccc,..., where aaa, bbb, and so forth, are the most frequently called routines from the profiling run in step 3.
   - If this action succeeds, repeat with the /optimize=4 and /inline_and_copy=... qualifiers. If this action fails, try rerunning with /roundoff=0 or /scalaropt=1, or with fewer routines inlined. (See Section 2.13 for an explanation of binary chop.) Also, if you have success at this step, try the suggestions in Section 2.12.
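
The advice to retry "with fewer routines inlined" pairs naturally with a binary chop over the routine list. Section 2.13 is not reproduced here, so the following is only a generic sketch of the technique in Python, not KAP's documented procedure. It assumes that exactly one inlined routine causes the failure. The names binary_chop, build_passes, and simulated_build are hypothetical; in practice build_passes would rebuild the program with only the given routines listed in /inline_and_copy and verify the results.

```python
from typing import Callable, List, Optional, Sequence


def binary_chop(routines: Sequence[str],
                build_passes: Callable[[List[str]], bool]) -> Optional[str]:
    """Return the one routine whose inlining breaks the build, or None.

    Assumes a single culprit: build_passes(subset) is False exactly
    when the culprit is in subset.
    """
    candidates = list(routines)
    if not candidates or build_passes(candidates):
        return None  # nothing to chop: the full list already works
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        if build_passes(half):
            # First half is clean, so the culprit is in the second half.
            candidates = candidates[len(half):]
        else:
            candidates = half
    return candidates[0]


# Hypothetical stand-in for "rebuild with these routines inlined and
# verify the results"; here we pretend that inlining "ccc" breaks the run.
def simulated_build(inlined: List[str]) -> bool:
    return "ccc" not in inlined


print(binary_chop(["aaa", "bbb", "ccc", "ddd"], simulated_build))  # prints ccc
```

Each pass halves the list, so a list of N routines needs roughly log2(N) rebuilds after the initial full build. If more than one routine is at fault, the single-culprit assumption no longer holds and the result identifies only one of them.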

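Steps 4 through 6 suggest /roundoff=0 when an optimized build produces wrong results. As background, the sketch below shows in Python why a computation can be roundoff-sensitive at all: floating-point addition is not associative, so evaluating the same sum in a different order can change the answer. It does not model KAP itself; that a higher /roundoff setting permits this kind of reordering is an assumption here, and sum_in_order is a hypothetical helper.

```python
from typing import List, Sequence


def sum_in_order(values: Sequence[float], order: List[int]) -> float:
    """Add values one at a time, visiting them in the given index order."""
    total = 0.0
    for index in order:
        total += values[index]
    return total


values = [1.0e16, 1.0, -1.0e16]

# Source order: the 1.0 is absorbed by 1.0e16 and lost before the cancel.
as_written = sum_in_order(values, [0, 1, 2])

# Reordered: the large terms cancel first, so the 1.0 survives.
reordered = sum_in_order(values, [0, 2, 1])

print(as_written, reordered)  # prints 0.0 1.0
```

A program whose result checks depend on such differences may pass unoptimized and fail once operations are rearranged, which matches the step 4 diagnosis of a roundoff-sensitive operation.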