Transparent fault tolerance solution at socket level based on RADIC

Abstract

We present a transparent fault tolerance middleware based on RADIC (Redundant Array of Distributed Independent Controllers), a transparent and scalable fault-tolerant architecture for parallel applications. The middleware operates at socket level and establishes a secure tunnel connection that keeps the TCP sessions opened by the application alive in spite of node failures. It runs at user level and is independent of the message-passing communication library being used. Protection is provided through uncoordinated checkpoints and message logging, and recovery is automatic, so node failures require no intervention from the administrator. We have tested our fault tolerance system by executing master-worker (M/W) and SPMD applications that follow different communication patterns. © 2012 IEEE.
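
The combination of uncoordinated checkpoints and message logging mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Protector`, `Worker`, `recover`, and `demo` are hypothetical, the "process" is a toy that sums integers, and stable storage on another node is simulated with an in-memory object. The sketch assumes a deterministic process, so that replaying the logged messages on top of the last checkpoint reproduces the state held before the failure.

```python
import copy


class Protector:
    """Stand-in for storage kept on a different node (hypothetical name)."""

    def __init__(self):
        self.checkpoint = None  # last saved copy of the process state
        self.log = []           # messages delivered since that checkpoint

    def save_checkpoint(self, state):
        self.checkpoint = copy.deepcopy(state)
        self.log.clear()        # older messages are already part of the checkpoint

    def log_message(self, msg):
        self.log.append(msg)


class Worker:
    """Toy deterministic process: its state is a running total of integers."""

    def __init__(self, protector, state=None):
        self.protector = protector
        self.state = {"total": 0, "received": 0} if state is None else state

    def deliver(self, msg, replay=False):
        if not replay:
            self.protector.log_message(msg)  # log first, then apply
        self.state["total"] += msg
        self.state["received"] += 1

    def checkpoint(self):
        # Uncoordinated: taken locally, with no synchronisation with other processes.
        self.protector.save_checkpoint(self.state)


def recover(protector):
    """Rebuild a failed worker from its last checkpoint plus the logged messages."""
    worker = Worker(protector, copy.deepcopy(protector.checkpoint))
    for msg in protector.log:
        worker.deliver(msg, replay=True)  # replay without logging again
    return worker


def demo():
    protector = Protector()
    worker = Worker(protector)
    for msg in (1, 2, 3):
        worker.deliver(msg)
    worker.checkpoint()                   # checkpoint after three messages
    for msg in (10, 20):
        worker.deliver(msg)               # these two exist only in the log
    expected = copy.deepcopy(worker.state)
    del worker                            # simulate the node failure
    restored = recover(protector)
    return expected, restored.state
```

Calling `demo()` returns the state before the simulated failure and the state rebuilt by `recover`; the two should match because the two messages received after the checkpoint are replayed from the log. A real socket-level system would additionally have to re-establish the TCP sessions, which this sketch does not attempt.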

Cite

Castro, M., Rexachs, D., & Luque, E. (2012). Transparent fault tolerance solution at socket level based on RADIC. In Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012 (pp. 831–832). https://doi.org/10.1109/ISPA.2012.121
