Minimizing communication in sparse matrix solvers

Marghoob Mohiyuddin; Mark Hoemmen; James Demmel; Katherine Yelick

Conference Proceedings

Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC '09 (2009)

DOI: 10.1145/1654059.1654096

86 Citations

56 Readers

Abstract

Data communication within the memory system of a single processor node and between multiple nodes in a system is the bottleneck in many iterative sparse matrix solvers like CG and GMRES. Here k iterations of a conventional implementation perform k sparse-matrix-vector-multiplications and Ω(k) vector operations like dot products, resulting in communication that grows by a factor of Ω(k) in both the memory and network. By reorganizing the sparse-matrix kernel to compute a set of matrix-vector products at once and reorganizing the rest of the algorithm accordingly, we can perform k iterations by sending O(log P) messages instead of O(k·log P) messages on a parallel machine, and reading the matrix A from DRAM to cache just once, instead of k times on a sequential machine. This reduces communication to the minimum possible. We combine these techniques to form a new variant of GMRES. Our shared-memory implementation on an 8-core Intel Clovertown gets speedups of up to 4.3x over standard GMRES, without sacrificing convergence rate or numerical stability. Copyright 2009 ACM.
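The idea of computing a set of matrix-vector products at once can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tridiagonal matrix stored as three plain lists, uses the monomial basis x, Ax, ..., A^k x for simplicity, and all function names (`spmv_tridiag`, `powers_naive`, `powers_blocked`) are hypothetical. The blocked variant loads each row block plus k ghost rows on either side once, then takes all k steps on that local slice.

```python
def spmv_tridiag(lower, diag, upper, x):
    """y = A x for a tridiagonal A; lower[0] and upper[-1] are ignored."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        acc = diag[i] * x[i]
        if i > 0:
            acc += lower[i] * x[i - 1]
        if i < n - 1:
            acc += upper[i] * x[i + 1]
        y[i] = acc
    return y


def powers_naive(lower, diag, upper, x, k):
    """Conventional approach: k full sweeps over the matrix."""
    basis = [list(x)]
    for _ in range(k):
        basis.append(spmv_tridiag(lower, diag, upper, basis[-1]))
    return basis


def powers_blocked(lower, diag, upper, x, k, block):
    """Each row block is read once (with k ghost rows per side) and all
    k products are computed from that local slice."""
    n = len(x)
    basis = [list(x)] + [[0.0] * n for _ in range(k)]
    for lo in range(0, n, block):
        hi = min(lo + block, n)
        g_lo, g_hi = max(lo - k, 0), min(hi + k, n)
        # Single read of the matrix rows and vector entries for this block.
        l, d, u = lower[g_lo:g_hi], diag[g_lo:g_hi], upper[g_lo:g_hi]
        cur = x[g_lo:g_hi]
        for j in range(1, k + 1):
            # Entries near an artificial slice edge become stale, one more
            # per step, but after j <= k steps they are still in the ghost zone.
            cur = spmv_tridiag(l, d, u, cur)
            basis[j][lo:hi] = cur[lo - g_lo:hi - g_lo]
    return basis
```

In this sketch the ghost rows are recomputed by neighboring blocks, so it trades some redundant arithmetic for fewer passes over the matrix. Both functions should return the same vectors for any block size of at least 1.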

Cite

Mohiyuddin, M., Hoemmen, M., Demmel, J., & Yelick, K. (2009). Minimizing communication in sparse matrix solvers. In Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, SC ’09. https://doi.org/10.1145/1654059.1654096
