The sliced COO format for sparse matrix-vector multiplication on CUDA-enabled GPUs

Hoang Vu Dang; Bertil Schmidt

Conference Proceedings (Open Access)

Procedia Computer Science (2012) 9 57-66

DOI: 10.1016/j.procs.2012.04.007

Abstract

Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU outperform their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA implementation to perform SpMV on the GPU. While previous work shows experiments on small to medium-sized sparse matrices, we perform evaluations on large sparse matrices. We compare SCOO performance to the existing formats of the NVIDIA Cusp library. Our results on a Fermi GPU show that SCOO outperforms the COO and CSR formats for all tested matrices and the HYB format for all tested unstructured matrices. Furthermore, a comparison to a Sandy Bridge CPU shows that SCOO on a Fermi GPU outperforms the multi-threaded CSR implementation of the Intel MKL Library on an i7-2700K by a factor between 5.5 and 18. © 2012 Published by Elsevier Ltd.
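For readers unfamiliar with the baseline formats named in the abstract, the following is a minimal, sequential sketch of SpMV over the standard COO and CSR layouts. It is not the SCOO format or the CUDA implementation from the paper, neither of which is described in this abstract. The function names `spmv_coo` and `spmv_csr` and the small 3x3 example matrix are assumptions made purely for illustration.

```python
# Illustrative only: plain COO and CSR SpMV on the CPU.
# This is NOT the paper's SCOO format; names and data are hypothetical.

def spmv_coo(rows, cols, vals, x, n_rows):
    """Compute y = A * x with A stored as COO triplets (row, col, value)."""
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        # Each nonzero contributes independently to its output row.
        y[r] += v * x[c]
    return y


def spmv_csr(row_ptr, cols, vals, x):
    """Compute y = A * x with A stored as CSR (row pointers + column indices)."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Nonzeros of row i occupy the half-open range [row_ptr[i], row_ptr[i+1]).
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[cols[k]]
        y.append(acc)
    return y


# Example matrix (assumed for illustration):
#   [1 0 2]
#   [0 3 0]
#   [4 0 5]
coo_rows = [0, 0, 1, 2, 2]
coo_cols = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
csr_row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y_coo = spmv_coo(coo_rows, coo_cols, values, x, n_rows=3)
y_csr = spmv_csr(csr_row_ptr, coo_cols, values, x)
```

Both layouts store the same nonzero values. COO keeps an explicit row index per nonzero, while CSR compresses the row indices into one pointer per row, which is why the two functions traverse the data differently but produce the same result.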

Cite (APA)

Dang, H. V., & Schmidt, B. (2012). The sliced COO format for sparse matrix-vector multiplication on CUDA-enabled GPUs. In Procedia Computer Science (Vol. 9, pp. 57–66). Elsevier B.V. https://doi.org/10.1016/j.procs.2012.04.007
