The sliced COO format for sparse matrix-vector multiplication on CUDA-enabled GPUs

Abstract

Existing formats for Sparse Matrix-Vector Multiplication (SpMV) on the GPU outperform their corresponding implementations on multi-core CPUs. In this paper, we present a new format called Sliced COO (SCOO) and an efficient CUDA implementation to perform SpMV on the GPU. While previous work shows experiments on small to medium-sized sparse matrices, we perform evaluations on large sparse matrices. We compare SCOO performance to existing formats of the NVIDIA Cusp library. Our results on a Fermi GPU show that SCOO outperforms the COO and CSR formats for all tested matrices and the HYB format for all tested unstructured matrices. Furthermore, comparison to a Sandy Bridge CPU shows that SCOO on a Fermi GPU outperforms the multi-threaded CSR implementation of the Intel MKL library on an i7-2700K by a factor between 5.5 and 18. © 2012 Published by Elsevier Ltd.
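
For readers unfamiliar with the baseline that SCOO builds on, the following is a minimal CPU sketch of SpMV over the plain coordinate (COO) format, in which each nonzero is stored as a (row, column, value) triplet. It is not the SCOO format or the CUDA kernel from the paper, neither of which the abstract describes in detail. The function name `spmv_coo` and the small example matrix are hypothetical and chosen only for illustration.

```python
def spmv_coo(rows, cols, vals, x, n_rows):
    """Compute y = A @ x where A is given as COO triplets.

    rows[k], cols[k], vals[k] describe the k-th nonzero of A.
    This is a plain sequential reference, not a GPU implementation.
    """
    y = [0.0] * n_rows
    for r, c, v in zip(rows, cols, vals):
        # Each nonzero contributes v * x[c] to output row r.
        y[r] += v * x[c]
    return y


# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]]
rows = [0, 0, 1, 2, 2]
cols = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_coo(rows, cols, vals, x, n_rows=3)
print(y)
```

On a GPU, the accumulation into `y[r]` is the step that needs care, because several threads may update the same output row concurrently.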

Cite

Dang, H. V., & Schmidt, B. (2012). The sliced COO format for sparse matrix-vector multiplication on CUDA-enabled GPUs. In Procedia Computer Science (Vol. 9, pp. 57–66). Elsevier B.V. https://doi.org/10.1016/j.procs.2012.04.007
