Current compilers for distributed-memory multiprocessors parallelize irregular reductions either by generating calls to sophisticated run-time systems (CHAOS) or by relying on replicated buffers and the shared-memory interface supported by software DSMs (TreadMarks). We introduce LocalWrite, a new technique for parallelizing irregular reductions based on the owner-computes rule. It eliminates the need for buffers or synchronized writes, but may replicate computation. We investigate the impact of connectivity (node/edge ratio), locality (accesses to local data) and adaptivity (edge modifications) on the relative performance of the three approaches. LocalWrite improves performance by 50–150% compared to using replicated buffers, and can match or exceed gather/scatter for applications with low locality or high adaptivity.
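The owner-computes idea behind LocalWrite can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the block partition of nodes, the toy edge contribution `x[u] - x[v]`, and the sequential loop standing in for separate processors are all assumptions made for illustration. Each simulated processor evaluates every edge that touches a node it owns but writes only to its own nodes, so no buffers or synchronized writes are needed, and edges that cross a partition boundary are computed twice.

```python
def serial_reduction(num_nodes, edges, x):
    """Reference irregular reduction: each edge updates both of its endpoints."""
    y = [0.0] * num_nodes
    for u, v in edges:
        f = x[u] - x[v]          # toy edge contribution (assumption)
        y[u] += f
        y[v] -= f
    return y


def local_write_reduction(num_nodes, edges, x, num_procs):
    """Owner-computes sketch: a processor writes only to nodes it owns.

    Processors are simulated one after another. For simplicity every
    simulated processor scans the full edge list and skips edges it does
    not touch; this scanning strategy is an assumption of the sketch.
    Returns the result and the number of edge evaluations performed.
    """
    block = max(1, (num_nodes + num_procs - 1) // num_procs)

    def owner(n):
        return n // block        # hypothetical block partition of nodes

    y = [0.0] * num_nodes
    evaluations = 0
    for p in range(num_procs):
        for u, v in edges:
            owns_u, owns_v = owner(u) == p, owner(v) == p
            if not (owns_u or owns_v):
                continue         # edge does not touch this processor's nodes
            f = x[u] - x[v]      # recomputed by both owners for cross edges
            evaluations += 1
            if owns_u:
                y[u] += f        # local write only
            if owns_v:
                y[v] -= f        # local write only
    return y, evaluations


edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
x = [1.0, 2.0, 4.0, 8.0]
y_local, evaluations = local_write_reduction(4, edges, x, num_procs=2)
```

In this small mesh three of the five edges cross the partition boundary, so the sketch performs eight edge evaluations instead of five. That extra work is the replicated computation the abstract mentions, and it grows as locality decreases.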
CITATION STYLE
Han, H., & Tseng, C. W. (1999). Improving compiler and run-time support for irregular reductions using local writes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1656, pp. 181–196). Springer Verlag. https://doi.org/10.1007/3-540-48319-5_12