In this paper, we present a complete architecture for improving the dependability of complex COTS and legacy-based systems. For long-lived applications, such as most of those being constructed nowadays via integration of legacy subsystems, fault treatment is a very important part of the fault tolerance strategy. The paper advocates the need for careful diagnosis and damage assessment, and for precise and effective recovery actions, specifically tailored to the affecting fault and/or to the extent of the damage in the affected component. In our proposal, threshold-based mechanisms are exploited to trigger alternative actions. The design and implementation of the resulting solution is illustrated with respect to a case study. This consists of a distributed architectural framework, handling replicated legacy-based subsystems. Replication and voting are used for error detection and masking. An experimental prototype deployed over a COTS-based LAN is described and has allowed a dependability analysis, via combined use of direct measurements and analytical modeling. © Springer-Verlag Berlin Heidelberg 2003.
CITATION STYLE
Bondavalli, A., Chiaradonna, S., Cotroneo, D., & Romano, L. (2003). A fault-tolerant distributed legacy-based system and its evaluation. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). Springer Verlag. https://doi.org/10.1007/978-3-540-45214-0_22
Mendeley helps you to discover research relevant for your work.