Reliable parallel programming model for distributed computing environments

Abstract

With the advent of large-scale heterogeneous platforms such as clusters and grids, resource failures are more likely to occur and to adversely affect applications. Consequently, there is an increasing need for techniques that achieve reliability during execution. This paper presents FT-Jace, a new reliable programming model for grid computing environments. FT-Jace achieves reliability in a manner that is transparent to the programmer. It is based on an active replication scheme capable of tolerating r arbitrary fail-silent (a faulty node does not produce any output) and fail-stop (no node recovery) node failures. The strength of our programming environment is that deploying the application does not require complicated failure-detection mechanisms. More precisely, node failures are masked, so there is no need to detect and handle them. We provide experimental results obtained on the Grid'5000 platform to demonstrate the usefulness of FT-Jace. © 2010 Springer-Verlag.
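The masking idea in the abstract can be illustrated with a minimal sketch. This is not FT-Jace's code or API: the names `replicated_call`, `run_replica`, and `failed_nodes` are hypothetical, the replica count of r + 1 is an assumption commonly used for fail-silent faults rather than something the abstract states, and the replicas are simulated one after another in a single process, whereas real active replication would run them concurrently on separate nodes.

```python
from typing import Callable, FrozenSet, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Sentinel standing in for "this node produced no output" (fail-silent).
_NO_OUTPUT = object()


class AllReplicasFailed(RuntimeError):
    """Raised when more than r replicas are silent, so nothing can be masked."""


def run_replica(task: Callable[[T], R], arg: T, node_failed: bool) -> object:
    """Simulate one replica: a failed node stays silent, a healthy one answers."""
    if node_failed:
        return _NO_OUTPUT
    return task(arg)


def replicated_call(
    task: Callable[[T], R],
    arg: T,
    r: int,
    failed_nodes: FrozenSet[int] = frozenset(),
) -> R:
    """Run the task on r + 1 simulated replicas and return the first output.

    No failure detector is involved: silent replicas are simply skipped,
    which is the sense in which failures are masked rather than handled.
    """
    for node in range(r + 1):
        output = run_replica(task, arg, node in failed_nodes)
        if output is not _NO_OUTPUT:
            return output  # type: ignore[return-value]
    raise AllReplicasFailed(f"all {r + 1} replicas were silent")
```

For instance, calling `replicated_call(lambda x: x * x, 7, r=2, failed_nodes=frozenset({0, 1}))` still yields a result because one of the three replicas survives, while a third failure would exceed r and raise `AllReplicasFailed`.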

Cite

Bahi, J. M., Hakem, M., & Mazouzi, K. (2010). Reliable parallel programming model for distributed computing environments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6043 LNCS, pp. 162–171). https://doi.org/10.1007/978-3-642-14122-5_20
