Global tiling for communication minimal parallelization on distributed memory systems

Abstract

Most previous studies on tiling focus on dividing the iteration space. On distributed memory parallel systems, however, the decomposition of computation and the distribution of data must be handled together in order to balance load and minimize data migration. In this paper, we formulate the problem of globally optimal tiling as a 0-1 integer linear program that minimizes total execution time. To simplify the selection of tiling parameters, we restrict tiles to semi-oblique shapes and present two effective approaches for deciding the tile shape in multi-dimensional semi-oblique tiling. We also present a tile-to-processor mapping scheme based on hyperplanes, which can express diverse forms of parallelism and achieves better performance than traditional methods. Experiments with NPB2.3-serial SP and LU on a Qsnet-connected cluster achieved average parallel efficiencies of 87% and 73%, respectively. © 2008 Springer-Verlag Berlin Heidelberg.
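To make the idea of a hyperplane-based tile-to-processor mapping concrete, the following is a minimal sketch and not the scheme from the paper, whose details the abstract does not give. It assumes tiles are identified by integer coordinates in a rectangular tile grid, and that all tiles lying on the same hyperplane (same value of the dot product with a chosen normal vector) are assigned cyclically to one processor. The names `map_tile_to_processor`, `processor_load`, `normal`, and `num_procs` are hypothetical.

```python
from collections import Counter
from itertools import product


def map_tile_to_processor(tile, normal, num_procs):
    """Assign a tile to a processor by the hyperplane it lies on.

    Tiles with the same value of normal . tile share a hyperplane and
    therefore a processor. The cyclic (modulo) assignment is an
    assumption made for illustration only.
    """
    if len(tile) != len(normal):
        raise ValueError("tile and normal must have the same dimension")
    plane = sum(n * t for n, t in zip(normal, tile))
    return plane % num_procs


def processor_load(grid_shape, normal, num_procs):
    """Count how many tiles of a rectangular tile grid land on each processor."""
    tiles = product(*(range(extent) for extent in grid_shape))
    return Counter(map_tile_to_processor(t, normal, num_procs) for t in tiles)
```

Under these assumptions, the normal `(1, 0)` reproduces a plain block-row distribution, while `(1, 1)` assigns anti-diagonals of tiles to processors. Changing the normal vector is one way a single mapping rule could express different parallelization patterns.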

Liu, L., Chen, L., Wu, C., & Feng, X. B. (2008). Global tiling for communication minimal parallelization on distributed memory systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5168 LNCS, pp. 382–391). https://doi.org/10.1007/978-3-540-85451-7_41
