The Modified Gram-Schmidt (MGS) orthogonalization process, used for example in the Arnoldi algorithm, is often the bottleneck that limits parallel efficiency. Indeed, computing the dot products requires a number of communications proportional to the square of the problem size. A block formulation is attractive, but it suffers from potential numerical instability. In this paper, we address this issue and propose a simple procedure that allows the use of a Block Gram-Schmidt algorithm while guaranteeing a numerical accuracy similar to that of MGS. The main idea is to determine the size of the blocks dynamically. The advantages of this dynamic procedure are twofold: first, high-performance matrix-vector multiplications can be used to decrease the execution time; second, in a parallel environment, the number of communications is reduced. Performance comparisons with the alternative Iterated CGS (classical Gram-Schmidt) also show an improvement for a moderate number of processors. © Springer-Verlag Berlin Heidelberg 1999.
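To make the communication argument concrete, here is a minimal sketch in plain Python that contrasts MGS with a block variant. It is an illustration under stated assumptions, not the algorithm from the paper: the block size is fixed by the caller, whereas the paper chooses it dynamically with a criterion the abstract does not specify. The function names (`mgs`, `block_gs`, `max_orthogonality_error`) and the "reduction" counter are hypothetical; the counter models each dot product, or each batched set of dot products, as one global communication.

```python
import math


def dot(u, v):
    """Dot product; in a distributed run this would need one global reduction."""
    return sum(a * b for a, b in zip(u, v))


def mgs(vectors):
    """Modified Gram-Schmidt. Returns (orthonormal vectors, reduction count)."""
    basis, reductions = [], 0
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(q, w)  # uses the already-updated w: the "modified" part
            reductions += 1
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        reductions += 1
        basis.append([wi / norm for wi in w])
    return basis, reductions


def block_gs(vectors, block_size):
    """Block Gram-Schmidt with a FIXED block size (an assumption of this sketch).

    Each new block is projected against all earlier basis vectors in one
    classical Gram-Schmidt step, then orthonormalized internally with MGS.
    """
    basis, reductions = [], 0
    for start in range(0, len(vectors), block_size):
        block = [list(v) for v in vectors[start:start + block_size]]
        if basis:
            # All coefficients come from the unmodified block, so they can be
            # computed together; modeled here as a single batched reduction.
            coeffs = [[dot(q, w) for q in basis] for w in block]
            reductions += 1
            block = [
                [wi - sum(c * q[i] for c, q in zip(cs, basis))
                 for i, wi in enumerate(w)]
                for w, cs in zip(block, coeffs)
            ]
        local, local_reductions = mgs(block)
        basis.extend(local)
        reductions += local_reductions
    return basis, reductions


def max_orthogonality_error(basis):
    """Largest entry of |Q^T Q - I|, a simple measure of lost orthogonality."""
    err = 0.0
    for i, u in enumerate(basis):
        for j, v in enumerate(basis):
            target = 1.0 if i == j else 0.0
            err = max(err, abs(dot(u, v) - target))
    return err


# Small, well-conditioned example (hypothetical data).
vectors = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 1.0],
    [1.0, 2.0, 3.0, 5.0],
]
q_mgs, n_mgs = mgs(vectors)
q_blk, n_blk = block_gs(vectors, block_size=2)
print("reductions:", n_mgs, "(MGS) vs", n_blk, "(block)")
print("orthogonality error:", max_orthogonality_error(q_mgs),
      max_orthogonality_error(q_blk))
```

On well-conditioned input like this, both variants return a nearly orthonormal basis and the block variant needs fewer modeled reductions. The instability mentioned in the abstract shows up on ill-conditioned input, where the single classical projection of a whole block can lose orthogonality; that is the situation a dynamic choice of block size is meant to guard against.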
CITATION STYLE
Vanderstraeten, D. (1999). A stable and efficient parallel block Gram-Schmidt algorithm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1685 LNCS, pp. 1128–1135). Springer Verlag. https://doi.org/10.1007/3-540-48311-x_158