Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks

Abstract

This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and Aᵀx to be computed efficiently in parallel, where A is an n × n sparse matrix with nnz ≥ n nonzeros and x is a dense n-vector. Our algorithms use Θ(nnz) work (serial running time) and Θ(√n lg n) span (critical-path length), yielding a parallelism of Θ(nnz/(√n lg n)), which is amply high for virtually any large matrix. The storage requirement for CSB is essentially the same as that for the more standard compressed-sparse-rows (CSR) format, for which computing Ax in parallel is easy but Aᵀx is difficult. Benchmark results indicate that on one processor, the CSB algorithms for Ax and Aᵀx run just as fast as the CSR algorithm for Ax, but the CSB algorithms also scale up linearly with processors until limited by off-chip memory bandwidth. Copyright 2009 ACM.
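
To make the layout concrete, below is a minimal C sketch of the CSB idea. It is not the authors' Cilk++ implementation: the struct layout, the names (CsbMatrix, csb_spmv, csb_spmv_t), and the simple per-nonzero triple storage are illustrative assumptions; the paper packs block-relative indices more compactly and recurses within blocks and block rows to achieve the stated Θ(√n lg n) span. What the sketch does show is the symmetry that motivates CSB: the same storage serves Ax (independent block rows) and Aᵀx (independent block columns) with identical work.

/* Minimal CSB sketch (illustrative, not the paper's implementation). */
#include <stddef.h>
#include <stdint.h>

typedef struct {
    size_t    n;        /* matrix dimension (n x n), assumed a multiple of beta */
    size_t    beta;     /* block dimension; the paper chooses beta near sqrt(n) */
    size_t   *blk_ptr;  /* (n/beta)^2 + 1 offsets into the nonzero arrays;
                           blocks stored in row-major block order */
    uint16_t *row;      /* row index of each nonzero, relative to its block */
    uint16_t *col;      /* column index of each nonzero, relative to its block */
    double   *val;      /* nonzero values */
} CsbMatrix;

/* y = A*x (y must be zero-initialized by the caller). Each block row i
 * writes only the segment y[i*beta .. (i+1)*beta), so the outer loop is
 * race-free and could be parallelized with OpenMP or Cilk. */
void csb_spmv(const CsbMatrix *A, const double *x, double *y)
{
    size_t nb = A->n / A->beta;               /* blocks per dimension */
    for (size_t i = 0; i < nb; i++) {         /* parallelizable over block rows */
        for (size_t j = 0; j < nb; j++) {
            size_t b = i * nb + j;            /* block (i, j) */
            for (size_t k = A->blk_ptr[b]; k < A->blk_ptr[b + 1]; k++)
                y[i * A->beta + A->row[k]] +=
                    A->val[k] * x[j * A->beta + A->col[k]];
        }
    }
}

/* y = A'*x (y must be zero-initialized). The same storage is traversed by
 * block column, with the roles of the row and column offsets swapped; each
 * block column j writes only its own segment of y, so this loop order is
 * equally parallelizable, unlike the row-oriented CSR format. */
void csb_spmv_t(const CsbMatrix *A, const double *x, double *y)
{
    size_t nb = A->n / A->beta;
    for (size_t j = 0; j < nb; j++) {         /* parallelizable over block columns */
        for (size_t i = 0; i < nb; i++) {
            size_t b = i * nb + j;
            for (size_t k = A->blk_ptr[b]; k < A->blk_ptr[b + 1]; k++)
                y[j * A->beta + A->col[k]] +=
                    A->val[k] * x[i * A->beta + A->row[k]];
        }
    }
}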

Citation (APA)
Buluç, A., Fineman, J. T., Frigo, M., Gilbert, J. R., & Leiserson, C. E. (2009). Parallel sparse matrix-vector and matrix-transpose-vector multiplication using compressed sparse blocks. In Annual ACM Symposium on Parallelism in Algorithms and Architectures (pp. 233–244). https://doi.org/10.1145/1583991.1584053
