Leveraging MPI-3 shared-memory extensions for efficient PGAS runtime systems

Huan Zhou; Kamran Idrees; José Gracia

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2015) 9233 373-384

DOI: 10.1007/978-3-662-48096-0_29

Abstract

The relaxed semantics and rich functionality of the one-sided communication primitives in MPI-3 make MPI an attractive candidate for implementing PGAS models. However, the performance of such implementations suffers because current MPI RMA implementations typically incur a large overhead when the source and target of a communication request share a common, local physical memory. In this paper, we present an optimized PGAS-like runtime system that uses the new MPI-3 shared-memory extensions to serve intra-node communication requests and MPI-3 one-sided communication primitives to serve inter-node communication requests. The performance of our runtime system is evaluated on a Cray XC40 system through low-level communication benchmarks, a random-access benchmark, and a stencil kernel. The results of the experiments demonstrate that the performance of our hybrid runtime system matches that of low-level RMA libraries for intra-node transfers and that of MPI-3 for inter-node transfers.
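The hybrid dispatch described in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the authors' runtime: it uses no MPI at all, and the names `Unit`, `HybridRuntime`, `put`, and `_rma_put` are hypothetical. In a real MPI-3 runtime, the intra-node path would presumably rely on a window created with `MPI_Win_allocate_shared` and a base pointer obtained from `MPI_Win_shared_query`, while the inter-node path would issue `MPI_Put`; here both paths are modeled by a byte copy so that only the locality decision is shown.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A PGAS unit (process) with a node id and a local memory segment."""
    rank: int
    node: int
    segment: bytearray = field(default_factory=lambda: bytearray(16))


class HybridRuntime:
    """Toy model of a runtime that picks a transfer path by locality."""

    def __init__(self, units):
        self.units = {u.rank: u for u in units}
        self.stats = {"shared": 0, "rma": 0}

    def put(self, origin, target, offset, data):
        """Write data into the target unit's segment; return the path used."""
        src, dst = self.units[origin], self.units[target]
        if offset < 0 or offset + len(data) > len(dst.segment):
            raise IndexError("write outside target segment")
        if src.node == dst.node:
            # Intra-node: direct store into memory both units can address.
            # (Assumption: stands in for a store through a shared-window pointer.)
            dst.segment[offset:offset + len(data)] = data
            path = "shared"
        else:
            # Inter-node: hand the request to the one-sided (RMA) layer.
            self._rma_put(dst, offset, data)
            path = "rma"
        self.stats[path] += 1
        return path

    def _rma_put(self, dst, offset, data):
        # Assumption: placeholder for a real one-sided put; modeled as a copy.
        dst.segment[offset:offset + len(data)] = data


if __name__ == "__main__":
    rt = HybridRuntime([Unit(0, node=0), Unit(1, node=0), Unit(2, node=1)])
    print(rt.put(0, 1, 0, b"abc"))  # units 0 and 1 share a node
    print(rt.put(0, 2, 4, b"xyz"))  # unit 2 lives on another node
    print(rt.stats)
```

The point of the sketch is that the caller sees a single `put` interface, and the choice between a plain memory store and an RMA operation is made inside the runtime from the node placement of the two units.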

Cite

Zhou, H., Idrees, K., & Gracia, J. (2015). Leveraging MPI-3 shared-memory extensions for efficient PGAS runtime systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9233, pp. 373–384). Springer Verlag. https://doi.org/10.1007/978-3-662-48096-0_29
