Improving compiler and run-time support for irregular reductions using local writes

Abstract

Current compilers for distributed-memory multiprocessors parallelize irregular reductions either by generating calls to sophisticated run-time systems (CHAOS) or by relying on replicated buffers and the shared-memory interface supported by software DSMs (TreadMarks). We introduce LocalWrite, a new technique for parallelizing irregular reductions based on the owner-computes rule. It eliminates the need for buffers or synchronized writes, but may replicate computation. We investigate the impact of connectivity (node/edge ratio), locality (accesses to local data) and adaptivity (edge modifications) on the relative performance of these techniques. LocalWrite improves performance by 50–150% compared to using replicated buffers, and can match or exceed gather/scatter for applications with low locality or high adaptivity.
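To make the trade-off described in the abstract concrete, here is a minimal sequential sketch in Python that contrasts a replicated-buffer reduction with an owner-computes scheme in the spirit of LocalWrite. It is an illustration under stated assumptions, not the authors' implementation: the function names, the toy `edge_force` kernel, the block partition of nodes, and the loop over simulated processors are all hypothetical, and a real system would give each processor its own precomputed edge list rather than scanning every edge.

```python
def edge_force(x, u, v):
    # Hypothetical per-edge kernel; stands in for the real computation.
    return x[u] - x[v]


def serial_reduction(x, edges):
    # Reference irregular reduction: each edge updates both of its endpoints.
    y = [0.0] * len(x)
    for u, v in edges:
        f = edge_force(x, u, v)
        y[u] += f
        y[v] -= f
    return y


def replicated_buffers(x, edges, nprocs):
    # Each simulated processor accumulates into a private full-size buffer;
    # the buffers are summed in a final combining step.
    n = len(x)
    buffers = [[0.0] * n for _ in range(nprocs)]
    for p in range(nprocs):
        for u, v in edges[p::nprocs]:  # cyclic split of the edge list
            f = edge_force(x, u, v)
            buffers[p][u] += f
            buffers[p][v] -= f
    return [sum(buf[i] for buf in buffers) for i in range(n)]


def local_write(x, edges, nprocs):
    # Owner-computes sketch: nodes are block-partitioned (an assumption), and
    # a processor writes only to nodes it owns. An edge whose endpoints have
    # different owners is evaluated once by each owner, which is the
    # replicated computation the abstract mentions.
    n = len(x)
    block = -(-n // nprocs)  # ceiling division
    owner = [i // block for i in range(n)]
    y = [0.0] * n
    evaluations = 0
    for p in range(nprocs):
        for u, v in edges:
            if owner[u] != p and owner[v] != p:
                continue  # neither endpoint is local; skip this edge
            f = edge_force(x, u, v)
            evaluations += 1
            if owner[u] == p:
                y[u] += f
            if owner[v] == p:
                y[v] -= f
    return y, evaluations


x = [1.0, 4.0, 2.0, 8.0]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
expected = serial_reduction(x, edges)
buffered = replicated_buffers(x, edges, nprocs=2)
local, evaluations = local_write(x, edges, nprocs=2)
```

In this toy mesh, two of the four edges cross the partition boundary, so the owner-computes version performs six kernel evaluations instead of four, but it needs no private buffers and no combining step, since every write targets a node owned by the processor doing the writing.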

Cite

Han, H., & Tseng, C. W. (1999). Improving compiler and run-time support for irregular reductions using local writes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1656, pp. 181–196). Springer Verlag. https://doi.org/10.1007/3-540-48319-5_12
