A well-known code transformation for improving the run-time performance of a program is loop unrolling. The most obvious benefit of unrolling a loop is that the transformed loop usually requires fewer instruction executions than the original loop. The reduction in instruction executions comes from two sources: the number of branch instructions executed is reduced, and the control variable is modified fewer times. In addition, for architectures with features designed to exploit instruction-level parallelism, loop unrolling can expose greater levels of instruction-level parallelism. Loop unrolling is an effective code transformation often improving the execution performance of programs that spend much of their execution time in loops by 10 to 30 percent. Possibly because of the effectiveness of a simple application of loop unrolling, it has not been studied as extensively as other code improvements such as register allocation or common subexpression elimination. The result is that many compilers employ simplistic loop unrolling algorithms that miss many opportunities for improving run-time performance. This paper describes how aggressive loop unrolling is done in a retargetable optimizing compiler. Using a set of 32 benchmark programs, the effectiveness of this more aggressive approach to loop unrolling is evaluated. The results show that aggressive loop unrolling can yield additional performance increase of 10 to 20 percent over the simple, naive approaches employed by many production compilers.
CITATION STYLE
Davidson, J. W., & Jinturkar, S. (1996). Aggressive loop unrolling in a retargetable, optimizing compiler. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1060, pp. 59–73). Springer Verlag. https://doi.org/10.1007/3-540-61053-7_53
Mendeley helps you to discover research relevant for your work.