Efficient Communication/Computation Overlap with MPI+OpenMP Runtimes Collaboration

Abstract

Overlapping network communications with computations is a major requirement for ensuring the scalability of HPC applications on future exascale machines. To this end, the de facto MPI standard provides non-blocking routines for asynchronous communication progress. In various implementations, a dedicated progress thread (PT) is deployed on the host CPU to actually achieve this overlap. However, current PT solutions struggle to balance efficient detection of network events against minimal impact on the application's computations. In this paper we propose a solution, inspired by the PT approach, that exploits the idle time of compute threads to make MPI communications progress in the background. We implement our idea in the context of MPI+OpenMP collaboration using the OpenMP Tools (OMPT) interface, which will be part of the OpenMP 5.0 standard. Our solution shows an overall performance gain on unbalanced workloads such as the AMG CORAL benchmark.

Citation (APA)

Sergent, M., Dagrada, M., Carribault, P., Jaeger, J., Pérache, M., & Papauré, G. (2018). Efficient Communication/Computation Overlap with MPI+OpenMP Runtimes Collaboration. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11014 LNCS, pp. 560–572). Springer Verlag. https://doi.org/10.1007/978-3-319-96983-1_40
