Automatic tiling of iterative stencil loops

  • Li Z
  • Song Y
  • 23

    Readers

    Mendeley users who have this article in their library.
  • 43

    Citations

    Citations of this article.

Abstract

Iterative stencil loops are used in scientific programs to implement relaxation methods for numerical simulation and signal processing. Such loops iteratively modify the same array elements over different time steps, which presents opportunities for the compiler to improve the temporal data locality through loop tiling. This article presents a compiler framework for automatic tiling of iterative stencil loops, with the objective of improving the cache performance. The article first presents a technique which allows loop tiling to satisfy data dependences in spite of the difficulty created by imperfectly nested inner loops. It does so by skewing the inner loops over the time steps and by applying a uniform skew factor to all loops at the same nesting level. Based on a memory cost analysis, the article shows that the skew factor must be minimized at every loop level in order to minimize cache misses. A graph-theoretical algorithm, which takes polynomial time, is presented to determine the minimum skew factor. Furthermore, the memory-cost analysis derives the tile size which minimizes capacity misses. Given the tile size, an efficient and general array-padding scheme is applied to remove conflict misses. Experiments were conducted on 16 test programs and preliminary results showed an average speedup of 1.58 and a maximum speedup of 5.06 across those test programs.

Get free article suggestions today

Mendeley saves you time finding and organizing research

Sign up here
Already have an account ?Sign in

Find this document

Get full text

Authors

  • Zhiyuan Li

  • Yonghong Song

Cite this document

Choose a citation style from the tabs below

Save time finding and organizing research with Mendeley

Sign up for free