Recovery tasks: An automated approach to failure recovery

Brian Demsky; Jin Zhou; William Montaz

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6418 LNCS 229-244

DOI: 10.1007/978-3-642-16612-9_18

Abstract

We present a new approach for developing robust software applications that breaks dependences on the failed parts of an application's execution to allow the rest of the application to continue executing. When a failure occurs, the recovery algorithm uses information from a static analysis to characterize the intended behavior of the application had it not failed. It then uses this characterization to recover as much of the application's execution as possible. We have implemented this approach in the Bristlecone compiler. We have evaluated our implementation on a multiplayer game, a web portal, and a MapReduce framework. We found that in the presence of injected failures, the recovery task version provided substantially better service than the control versions. Moreover, the recovery task version of the game benchmark successfully recovered from a real fault that we accidentally introduced during development, while the same fault caused the two control versions to crash. © 2010 Springer-Verlag.

Cite

CITATION STYLE

APA

Demsky, B., Zhou, J., & Montaz, W. (2010). Recovery tasks: An automated approach to failure recovery. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6418 LNCS, pp. 229–244). https://doi.org/10.1007/978-3-642-16612-9_18
