The computation and communication complexity of a parallel banded system solver


Abstract

We present an algorithm for solving banded positive definite linear systems on a multiprocessor computer whose number of processors p is much less than the order of the system n. Assuming that the banded matrix, of bandwidth 2m + 1, is stored in the global memory by diagonals as several one-dimensional arrays, we consider the time required by several alignment networks for allocating the appropriate data to the local memory of each processor. We demonstrate that the time required in this preprocessing stage does not exceed that required by the algorithm provided we use a shuffle exchange, a pipelined shuffle exchange, or a crossbar switch. Once the data are allocated in the local memories, the algorithm requires only a “nearest neighbor” alignment network to achieve a total time of O(m²n/p). The total cost of the algorithm is minimized when p. © 1984, ACM. All rights reserved.
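The storage scheme mentioned in the abstract, a banded matrix of bandwidth 2m + 1 held by diagonals as several one-dimensional arrays, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `band_to_diagonals` and `band_entry`, the use of a dictionary keyed by diagonal offset, and the choice to keep all 2m + 1 diagonals (rather than exploiting symmetry) are assumptions made for illustration only.

```python
def band_to_diagonals(a, m):
    """Split a dense n x n matrix of bandwidth 2m + 1 into one-dimensional arrays.

    Hypothetical helper: diagonals[k] holds the entries a[i][i + k]
    for offset k = -m .. m (k = 0 is the main diagonal).
    """
    n = len(a)
    diagonals = {}
    for k in range(-m, m + 1):
        # The diagonal at offset k has n - |k| entries; its first row is max(0, -k).
        start = max(0, -k)
        diagonals[k] = [a[i][i + k] for i in range(start, start + n - abs(k))]
    return diagonals


def band_entry(diagonals, m, i, j):
    """Read entry (i, j) back from the diagonal arrays; zero outside the band."""
    k = j - i
    if abs(k) > m:
        return 0
    return diagonals[k][i - max(0, -k)]
```

For a tridiagonal matrix (m = 1) this yields three arrays: one of length n for the main diagonal and two of length n - 1 for the off-diagonals. Total storage is therefore about (2m + 1)n values instead of n².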

Cite

Lawrie, D. H., & Sameh, A. H. (1984). The computation and communication complexity of a parallel banded system solver. ACM Transactions on Mathematical Software (TOMS), 10(2), 185–195. https://doi.org/10.1145/399.401
