An efficient fault-tolerant routing methodology for fat-tree interconnection networks

Abstract

In large cluster-based machines, fault tolerance in the interconnection network is an issue of growing importance, since their increasing size raises the probability of failure. The topology used in these machines is usually a fat-tree. This paper proposes a new distributed fault-tolerant routing methodology for fat-trees. It does not require additional network hardware. It is scalable, since the required memory, switch hardware and routing delay do not depend on the network size. The methodology is based on enhancing the Interval Routing scheme with exclusion intervals. An exclusion interval is associated with each switch output port and represents the set of nodes that are unreachable from that port after a failure occurs. We propose a mechanism to identify the exclusion intervals that must be updated after a failure is detected, and the values to write to them. Our methodology is able to support a relatively high number of network failures with low degradation in network performance. © Springer-Verlag Berlin Heidelberg 2007.
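The role of an exclusion interval can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no data structures, so the names `OutputPort`, `contains` and `candidate_ports`, the use of a single inclusive interval per port, and the example switch are all assumptions made for illustration. The sketch only shows the per-port check (the destination is inside the routing interval and outside the exclusion interval); it does not model the mechanism that decides which exclusion intervals to update after a failure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumption: an interval is an inclusive [low, high] range of destination node IDs.
Interval = Tuple[int, int]


def contains(interval: Optional[Interval], dest: int) -> bool:
    """Return True if dest lies inside the inclusive interval (None means empty)."""
    if interval is None:
        return False
    low, high = interval
    return low <= dest <= high


@dataclass
class OutputPort:
    """Hypothetical model of one switch output port."""
    port_id: int
    routing_interval: Interval                     # destinations normally reachable via this port
    exclusion_interval: Optional[Interval] = None  # destinations unreachable after a failure


def candidate_ports(ports: List[OutputPort], dest: int) -> List[int]:
    """List the ports that may forward a packet towards dest."""
    return [
        p.port_id
        for p in ports
        if contains(p.routing_interval, dest)
        and not contains(p.exclusion_interval, dest)
    ]


# Hypothetical switch with two ports that both normally reach nodes 0..15.
ports = [OutputPort(0, (0, 15)), OutputPort(1, (0, 15))]
before = candidate_ports(ports, 5)      # no failure recorded yet

# Assumed failure: nodes 4..7 are no longer reachable through port 0.
ports[0].exclusion_interval = (4, 7)
after = candidate_ports(ports, 5)       # destination inside the exclusion interval
unaffected = candidate_ports(ports, 9)  # destination outside the exclusion interval
```

In this sketch the check inspects only a fixed number of values per port, which is in line with the abstract's claim that memory and routing delay do not depend on the network size.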

Cite

Gómez, C., Gómez, M. E., López, P., & Duato, J. (2007). An efficient fault-tolerant routing methodology for fat-tree interconnection networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4742 LNCS, pp. 509–522). Springer Verlag. https://doi.org/10.1007/978-3-540-74742-0_46
