Towards understanding Post-recovery efficiency for shrinking and non-shrinking recovery

Abstract

We explore the post-recovery efficiency of shrinking and non-shrinking recovery schemes on high-performance computing systems using a synthetic benchmark. We study the impact of network topology on post-recovery communication performance. Our experiments on the IBM BG/Q system Mira show that shrinking recovery can deliver up to 7.5% better efficiency for a neighbor communication pattern, because non-shrinking recovery can reduce communication performance. We expected a similar outcome for our synthetic benchmark with collective communication, but the results are quite different. Both shrinking and non-shrinking recovery reduce MPI performance (MPICH3.1) dramatically on collective communication, by up to 14×, swamping any differences between the two approaches. This suggests that making MPI performance less sensitive to irregularity in performance and communicator size is critical for both recovery approaches.

Cite

Fang, A., Fujita, H., & Chien, A. A. (2015). Towards understanding Post-recovery efficiency for shrinking and non-shrinking recovery. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9523, pp. 656–668). Springer Verlag. https://doi.org/10.1007/978-3-319-27308-2_53
