A Watchdog Processor Based General Rollback Technique with Multiple Retries

  • Upadhyaya J
  • Saluja K

Abstract

A common assumption in existing rollback techniques is that transients, the cause of most failures, subside very quickly, implying that a single retry of the program from the previous rollback point is sufficient. The authors discuss a general rollback strategy with n (n ≥ 2) retries, which takes into consideration multiple transient failures as well as transients of long duration. Ways of deriving practical values of n for a given program are also discussed. Furthermore, the authors propose the use of a watchdog processor as an error detection tool to initiate recovery action through rollback, since the watchdog processor offers low error latency. They also discuss merging the watchdog processor with the rollback recovery technique to enhance overall system reliability.
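
As a rough illustration of the idea of retrying from a rollback point up to n times, the following Python sketch may help. It is not the authors' implementation: the names `TransientError`, `run_with_rollback`, and `make_flaky_segment` are hypothetical, the watchdog processor is simulated by an exception, and the checkpoint is assumed to be a simple in-memory copy of the program state.

```python
import copy


class TransientError(Exception):
    """Hypothetical stand-in for an error signal from the detector (e.g. a watchdog)."""


def run_with_rollback(segment, state, n=2):
    """Run `segment` on `state`; on a detected error, roll back and retry up to n times.

    Returns the resulting state and the number of retries that were needed.
    """
    checkpoint = copy.deepcopy(state)      # rollback point saved before the segment
    for attempt in range(n + 1):           # one initial run plus up to n retries
        work = copy.deepcopy(checkpoint)   # restore state from the rollback point
        try:
            segment(work)
            return work, attempt
        except TransientError:
            continue                       # transient still present: roll back again
    raise RuntimeError(f"segment still failing after {n} retries")


def make_flaky_segment(failures):
    """Build a segment that fails `failures` times, simulating a long-lived transient."""
    remaining = {"count": failures}

    def segment(state):
        state["x"] += 1                    # partial update that rollback must undo
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise TransientError("detected error (simulated)")
        state["done"] = True

    return segment


if __name__ == "__main__":
    result, retries = run_with_rollback(
        make_flaky_segment(2), {"x": 0, "done": False}, n=3
    )
    print(result, retries)
```

In this sketch a transient that outlasts the first retry is still survived as long as n is large enough, whereas a single-retry scheme (n = 1) would give up on the same fault.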

Author-supplied keywords

  • Error detection
  • error latency
  • program retry
  • recovery time
  • roll back recovery
  • transient errors

Authors

  • J. Shambhu Upadhyaya

  • Kewal K. Saluja
