A survey on fault management techniques in distributed computing

Abstract

Distributed computing systems are growing rapidly, and faults are growing in scale along with them, despite the many fault detection techniques that have been proposed. Designing and implementing distributed computing systems is challenging because of their ever-increasing scale and complexity. A distributed system that becomes faulty, for whatever reason, while executing its processes can cause damage. A fault management system helps a distributed system by detecting malfunctions, errors, and faults. We investigate the fault tolerance techniques used in real-time distributed systems, concentrating on types of faults, fault detection techniques, and the recovery techniques used. Link failures, resource failures, and any other failures must be detected and rectified for the system to work accurately and without disruption. Fault management applications are thereby enabled to determine the root cause of a distributed system failure automatically. To address fault detection in distributed systems, we propose combining proactive and reactive techniques in an expert system for managing faults. © 2013 Springer-Verlag.

Cite

Kavila, S. D., Prasada Raju, G. S. V., Satapathy, S. C., Machiraju, A., Kinnera, G. V. L., & Rasly, K. (2013). A survey on fault management techniques in distributed computing. In Advances in Intelligent Systems and Computing (Vol. 199 AISC, pp. 593–602). Springer Verlag. https://doi.org/10.1007/978-3-642-35314-7_67
