The Checkpoint-Timing for Backward Fault-Tolerant Schemes


Abstract

To improve the performance of backward fault-tolerant schemes in long-running parallel applications, a general checkpoint-timing method was proposed that determines unequal checkpointing intervals for an arbitrary failure rate, with the goal of reducing the total execution time. First, a new model was introduced to evaluate the mean expected execution time. Second, the optimality condition was derived from this model for a constant failure rate, from which the optimal equal checkpointing interval can be obtained easily. Subsequently, a general method was derived to determine the checkpoint timing for other failure rates. The final results showed that the proposal is practical for trading off the re-processing overhead against the checkpointing overhead in backward fault-tolerant schemes.
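
The trade-off between re-processing overhead and checkpointing overhead in the constant-failure-rate case can be illustrated with a short sketch. The abstract does not reproduce the paper's own model, so the code below does not implement it. As stand-ins it uses two classical results: Young's first-order approximation for the equal checkpoint interval, and a Daly-style expression for expected run time under exponentially distributed failures. All function and parameter names (`young_interval`, `expected_time_per_unit_work`, `checkpoint_cost`, `restart_cost`, `failure_rate`) are hypothetical and chosen only for illustration.

```python
import math


def young_interval(checkpoint_cost: float, failure_rate: float) -> float:
    """Young's first-order approximation: sqrt(2 * C / lambda).

    Assumption: failures arrive at a constant rate and the checkpoint
    cost is small compared with the mean time between failures.
    """
    return math.sqrt(2.0 * checkpoint_cost / failure_rate)


def expected_time_per_unit_work(interval: float, checkpoint_cost: float,
                                restart_cost: float,
                                failure_rate: float) -> float:
    """Expected wall-clock time per unit of useful work (Daly-style model).

    Assumption: exponential failures with rate `failure_rate`; each segment
    does `interval` of work followed by one checkpoint of `checkpoint_cost`.
    """
    lam = failure_rate
    segment = math.exp(lam * restart_cost) * (
        math.exp(lam * (interval + checkpoint_cost)) - 1.0) / lam
    return segment / interval


def best_interval_by_scan(checkpoint_cost: float, restart_cost: float,
                          failure_rate: float, lo: int = 100,
                          hi: int = 20000) -> int:
    """Brute-force scan over integer intervals (seconds) for the minimum."""
    return min(
        range(lo, hi + 1),
        key=lambda t: expected_time_per_unit_work(
            t, checkpoint_cost, restart_cost, failure_rate),
    )


if __name__ == "__main__":
    # Illustrative values only: 60 s checkpoints, one failure per day.
    C, R, LAM = 60.0, 60.0, 1.0 / 86400.0
    print("Young approximation (s):", young_interval(C, LAM))
    print("Scan minimum (s):", best_interval_by_scan(C, R, LAM))
```

Intervals that are too short spend the run writing checkpoints, while intervals that are too long lose more work per failure. The scan makes that trade-off visible under the assumed model. For a failure rate that varies over time, which is the case the paper targets, an equal interval is no longer optimal in general and this sketch does not apply.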

Cite

Zhang, M. (2018). The Checkpoint-Timing for Backward Fault-Tolerant Schemes. In Communications in Computer and Information Science (Vol. 908, pp. 210–218). Springer Verlag. https://doi.org/10.1007/978-981-13-2423-9_16
