Model-based failure management for distributed reactive systems

Abstract

Failure management is key to the development of safety-critical, distributed, reactive systems common in such applications as avionics, automotive, and sensor/actuator networks. Specific challenges to effective failure management include (i) developing an understanding of the application domain so as to define what constitutes a failure; (ii) disentangling failure management concepts at design time and runtime; and (iii) detecting and mitigating failures at the level of systems-of-systems integration. In this paper, we address (i) and (ii) by developing a failure ontology for logical and deployment architectures, respectively, including a mapping between the two. This ontology is based on the interaction patterns (or services) defining the component interplay in a distributed system. We address (iii) by defining detectors and mitigators at the service/interaction level; we discuss how to derive detectors for a significant subset of the failure ontology directly from the interaction patterns. We demonstrate the utility of our techniques using a large-scale oceanographic sensor/actuator network. © Springer-Verlag Berlin Heidelberg 2007.

Ermagan, V., Krüger, I., & Menarini, M. (2007). Model-based failure management for distributed reactive systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4888 LNCS, pp. 53–74). https://doi.org/10.1007/978-3-540-77419-8_4
