A synchronization network (SN) consists of processing elements (PEs) at the leaves of a complete binary tree, with routing switches at the interior nodes. We study the problem of rendering an SN tolerant to PE failures by adding queues to its edges. We obtain the following results. In the worst case, an N-PE SN whose edges have queues of capacity O(log log N) can tolerate the failure of a positive fraction of its PEs, no matter how the failed PEs are distributed; furthermore, this capacity requirement cannot be lowered by more than a small constant factor. In the expected case, with probability exceeding 1 - N^(-Ω(1)), an N-PE SN whose edges have queues of capacity O(log log log N) can tolerate the failure of a positive fraction of its PEs; we do not know whether this capacity requirement can be lowered. We also present an algorithm that, given an SN with queues of capacity C, salvages a maximum number of fault-free PEs; the running time is a low-degree polynomial in N even when C is as large as log(N/log N).
Bhatt, S. N., Chung, F. R. K., Leighton, F. T., & Rosenberg, A. L. (1992). Tolerating faults in synchronization networks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 634 LNCS, pp. 1–12). Springer Verlag. https://doi.org/10.1007/3-540-55895-0_391