Optimal checkpointing period: Time vs. energy

Abstract

This short paper deals with parallel scientific applications that use non-blocking, periodic coordinated checkpointing to enforce resilience. We provide a model and detailed formulas for total execution time and consumed energy. We characterize the optimal period for both objectives, and we assess the range of time/energy trade-offs by instantiating the model with a set of realistic scenarios for Exascale systems. We place particular emphasis on I/O transfers, because the relative cost of communication is expected to increase dramatically, both in terms of latency and consumed energy, on future Exascale platforms.

Cite

Aupy, G., Benoit, A., Hérault, T., Robert, Y., & Dongarra, J. (2014). Optimal checkpointing period: Time vs. energy. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8551, pp. 203–214). Springer Verlag. https://doi.org/10.1007/978-3-319-10214-6_10
