We propose a variety of batched routines for concurrently processing a large collection of small, independent sparse matrix-vector products (SpMV) on graphics processing units (GPUs). These batched SpMV kernels are designed to be flexible, handling batches of matrices that differ in size, nonzero count, and nonzero distribution. Furthermore, they support the three most commonly used sparse storage formats: CSR, COO, and ELL. Our experimental results on a state-of-the-art GPU reveal performance improvements of up to 25× compared to non-batched SpMV routines.
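To make the operation concrete, below is a minimal serial sketch of a single CSR SpMV — the computation each batch entry performs. This is an illustration only, not the batched GPU kernels proposed in the paper; all function and variable names are hypothetical.

```python
# Serial CSR sparse matrix-vector product y = A @ x (illustrative only).
# row_ptr[r] .. row_ptr[r+1] delimits the nonzeros of row r in
# the parallel arrays col_idx (column indices) and vals (values).
def csr_spmv(row_ptr, col_idx, vals, x):
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y

# A 3x3 example: [[1, 0, 2], [0, 3, 0], [4, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]
print(csr_spmv(row_ptr, col_idx, vals, x))  # prints [3.0, 3.0, 9.0]
```

In a batched setting, many such small products are launched together so that one GPU kernel amortizes launch overhead across the whole collection.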
Anzt, H., Collins, G., Dongarra, J., Flegar, G., & Quintana-Ortí, E. S. (2017). Flexible batched sparse matrix-vector product on GPUs. In Proceedings of ScalA 2017: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, held in conjunction with SC 2017: The International Conference for High Performance Computing, Networking, Storage and Analysis. Association for Computing Machinery. https://doi.org/10.1145/3148226.3148230