The performance of linear relaxation codes strongly depends on an efficient usage of caches. This paper considers one time step of the Jacobi and Gauß-Seidel kernels on a 3D array, and shows that tiling reduces the number of capacity misses to almost optimum. In particular, we prove that Ω(N³/(L√C)) capacity misses are needed for array size N × N × N, cache size C, and line size L. If cold misses are taken into account, tiling is off the lower bound by a factor of about 1 + 5/√(LC). The exact value depends on tile size and data layout. We show analytically that rectangular tiles of shape (N − 2) × s × (sL/2) outperform square tiles, for row-major storage order. © 2002 Springer-Verlag Berlin Heidelberg.
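To make the tiling idea concrete, the following is a minimal sketch of one Jacobi time step on an N × N × N array stored in row-major order, once with a plain loop nest and once with rectangular tiles of shape (N − 2) × s × (sL/2). Several details are assumptions for illustration and are not taken from the paper: the six-neighbour averaging stencil, the function names `jacobi_plain` and `jacobi_tiled`, measuring the line size in array elements, and mapping the sL/2 extent onto the contiguous (innermost) dimension. The sketch only shows the loop restructuring; it does not model the cache or count misses.

```python
def jacobi_plain(a, n):
    """One Jacobi step on the interior of an n*n*n row-major array (flat list)."""
    b = list(a)  # boundary values are carried over unchanged
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            for k in range(1, n - 1):
                c = (i * n + j) * n + k  # row-major index, k is contiguous
                b[c] = (a[c - n * n] + a[c + n * n] +
                        a[c - n] + a[c + n] +
                        a[c - 1] + a[c + 1]) / 6.0
    return b


def jacobi_tiled(a, n, s, line):
    """Same step, traversed in tiles of shape (n-2) x s x (s*line/2).

    Assumption: `line` is the cache line size in array elements, and the
    long tile edge lies along the contiguous k dimension.
    """
    tj = s                        # tile extent in j
    tk = max(1, (s * line) // 2)  # tile extent in k
    b = list(a)
    for jj in range(1, n - 1, tj):          # loop over tiles in j
        for kk in range(1, n - 1, tk):      # loop over tiles in k
            for i in range(1, n - 1):       # each tile spans all interior i
                for j in range(jj, min(jj + tj, n - 1)):
                    for k in range(kk, min(kk + tk, n - 1)):
                        c = (i * n + j) * n + k
                        b[c] = (a[c - n * n] + a[c + n * n] +
                                a[c - n] + a[c + n] +
                                a[c - 1] + a[c + 1]) / 6.0
    return b
```

Because Jacobi writes to a separate output array, both traversals compute identical values; the tiled version differs only in the order in which elements are visited, which is what affects cache reuse. In this sketch, `s` would be chosen so that the working set of a tile fits in the cache.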
Leopold, C. (2002). Tight bounds on capacity misses for 3D stencil codes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2329 LNCS, pp. 843–852). Springer Verlag. https://doi.org/10.1007/3-540-46043-8_85