Flexible batched sparse matrix-vector product on GPUs

Abstract

We propose a variety of batched routines for concurrently processing a large collection of small, independent sparse matrix-vector products (SpMV) on graphics processing units (GPUs). These batched SpMV kernels are designed to be flexible in order to handle a batch of matrices which differ in size, nonzero count, and nonzero distribution. Furthermore, they support the three most commonly used sparse storage formats: CSR, COO and ELL. Our experimental results on a state-of-the-art GPU reveal performance improvements of up to 25× compared to non-batched SpMV routines.
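To make the idea concrete, the following is a minimal CUDA sketch of what a flexible batched CSR SpMV could look like, assuming one thread block per matrix in the batch and per-matrix CSR arrays so that sizes and sparsity patterns may differ. This is purely illustrative and is not the authors' implementation; all structure and function names here are hypothetical.

```cuda
#include <cuda_runtime.h>

// Hypothetical container for a batch of independent CSR matrices.
struct BatchedCsr {
    int num_matrices;
    const int * const *row_ptrs;     // row_ptrs[i] has num_rows[i] + 1 entries
    const int * const *col_idxs;     // column indices of matrix i
    const double * const *values;    // nonzero values of matrix i
    const int *num_rows;             // row count of each matrix in the batch
};

// One thread block processes one small matrix; threads of the block
// cooperatively loop over that matrix's rows.
__global__ void batched_csr_spmv(BatchedCsr batch,
                                 const double * const *x,  // input vectors
                                 double * const *y)        // output vectors
{
    const int mat = blockIdx.x;
    if (mat >= batch.num_matrices) return;

    const int *row_ptr = batch.row_ptrs[mat];
    const int *col_idx = batch.col_idxs[mat];
    const double *val  = batch.values[mat];
    const int rows     = batch.num_rows[mat];

    for (int row = threadIdx.x; row < rows; row += blockDim.x) {
        double sum = 0.0;
        for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
            sum += val[k] * x[mat][col_idx[k]];
        }
        y[mat][row] = sum;
    }
}

// Example launch: one block per matrix, e.g. 128 threads per block.
// batched_csr_spmv<<<batch.num_matrices, 128>>>(batch, x, y);
```

Mapping one block per matrix is only one possible scheduling strategy; the paper's kernels additionally have to balance work across matrices with very different nonzero counts and support COO and ELL layouts.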

Cite

CITATION STYLE

APA

Anzt, H., Collins, G., Dongarra, J., Flegar, G., & Quintana-Ortí, E. S. (2017). Flexible batched sparse matrix-vector product on GPUs. In Proceedings of ScalA 2017: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC 2017: The International Conference for High Performance Computing, Networking, Storage and Analysis. Association for Computing Machinery, Inc. https://doi.org/10.1145/3148226.3148230
