Implementing blocked sparse matrix-vector multiplication on NVIDIA GPUs

Alexander Monakov; Arutyun Avetisyan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5657 LNCS 289-297

DOI: 10.1007/978-3-642-03138-0_32

24 Citations

32 Readers

Abstract

We discuss the implementation of blocked sparse matrix-vector multiplication for NVIDIA GPUs. We outline an algorithm and various optimizations, and identify potential future improvements and challenging tasks. Compared with a previously published implementation, ours is faster on matrices having many high fill-ratio blocks but slower on matrices with a low number of non-zero elements per row. © 2009 Springer Berlin Heidelberg.
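
To make the terms "blocked" and "fill ratio" concrete, the following is a minimal CPU reference sketch in Python. It is not the authors' GPU implementation. It assumes a generic block compressed sparse row (BCSR-style) layout with square blocks, and it takes "fill ratio" to mean the fraction of stored block entries that are non-zero. The abstract does not spell out either the storage format or that definition, and the names `to_bcsr`, `bcsr_spmv`, and `fill_ratio` are hypothetical.

```python
# CPU reference sketch of blocked (BCSR-style) SpMV.
# Assumption: generic BCSR layout; this is not the paper's GPU kernel.


def to_bcsr(dense, bs):
    """Convert a dense matrix (list of rows) into BCSR with bs x bs blocks.

    Assumes both dimensions are multiples of bs. Any block containing at
    least one non-zero is stored in full, explicit zeros included.
    """
    n_rows, n_cols = len(dense), len(dense[0])
    row_ptr, col_idx, blocks = [0], [], []
    for bi in range(n_rows // bs):
        for bj in range(n_cols // bs):
            block = [dense[bi * bs + r][bj * bs:(bj + 1) * bs] for r in range(bs)]
            if any(v != 0 for row in block for v in row):
                col_idx.append(bj)       # block-column index of this block
                blocks.append(block)
        row_ptr.append(len(blocks))      # end of this block row
    return row_ptr, col_idx, blocks


def bcsr_spmv(row_ptr, col_idx, blocks, x, bs):
    """Compute y = A * x, one block row at a time."""
    y = [0.0] * ((len(row_ptr) - 1) * bs)
    for bi in range(len(row_ptr) - 1):
        for k in range(row_ptr[bi], row_ptr[bi + 1]):
            x0 = col_idx[k] * bs         # first x entry this block touches
            for r in range(bs):
                for c in range(bs):
                    y[bi * bs + r] += blocks[k][r][c] * x[x0 + c]
    return y


def fill_ratio(blocks):
    """Fraction of stored block entries that are non-zero (assumed definition)."""
    stored = sum(len(row) for b in blocks for row in b)
    nonzero = sum(1 for b in blocks for row in b for v in row if v != 0)
    return nonzero / stored if stored else 0.0


# Small illustrative matrix: one dense 2x2 block and one half-empty block.
A = [
    [1, 2, 0, 0],
    [3, 4, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 6],
]
row_ptr, col_idx, blocks = to_bcsr(A, 2)
y = bcsr_spmv(row_ptr, col_idx, blocks, [1, 1, 1, 1], 2)
ratio = fill_ratio(blocks)
```

In this sketch, blocks with a high fill ratio spend little storage and arithmetic on explicit zeros, while each block needs only one column index. That is one plausible reading of why a blocked scheme would favour matrices with many well-filled blocks, as the abstract reports.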

APA

Monakov, A., & Avetisyan, A. (2009). Implementing blocked sparse matrix-vector multiplication on NVIDIA GPUs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5657 LNCS, pp. 289–297). https://doi.org/10.1007/978-3-642-03138-0_32
