Abstract
This paper presents an overview of our work on a complete end-to-end framework for automatically generating message-passing parallel code for tiled nested for-loops. The framework handles general parallelepiped tiling transformations and general convex iteration spaces. We address all problems regarding both the generation of sequential tiled code and its parallelization. We have implemented our techniques in a tool that automatically generates MPI parallel code, and we conducted several series of experiments on the compilation time of the tool, the efficiency of the generated code, and the speedup attained on a cluster of PCs. Apart from confirming the value of our techniques, our experimental results show the merit of general parallelepiped tiling transformations and verify previous theoretical work on scheduling-optimal tile shapes.
Goumas, G., Drosinos, N., Athanasaki, M., & Koziris, N. (2004). Automatic parallel code generation for tiled nested loops. In Proceedings of the ACM Symposium on Applied Computing (Vol. 2, pp. 1412–1419). Association for Computing Machinery. https://doi.org/10.1145/967900.968184