Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors

Abstract

Sparse matrix-vector multiplication (SpMV) is one of the most important kernels in many scientific and engineering applications. On SIMD processors, its main performance bottleneck is the low utilization of SIMD units and memory bandwidth. This paper proposes a hybrid optimization method to break that bottleneck. The method combines a new compressed sparse matrix format, a block SpMV algorithm, and a vector write buffer. Experimental results show that the hybrid method achieves an average speedup of 2.09 over the CSR vector kernel across all the matrices, with a maximum speedup of 3.24. © IEICE 2013.
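For readers unfamiliar with the CSR layout that the baseline builds on, the following is a minimal sketch of a plain scalar SpMV over CSR arrays. It is not the paper's new format, its block algorithm, or the CSR vector kernel it compares against; the function name `spmv_csr` and the array names `values`, `col_idx`, and `row_ptr` are illustrative assumptions, following the usual CSR convention.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative sketch)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read through col_idx, so these accesses are indirect.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix:
# A = [[4, 0, 1],
#      [0, 2, 0],
#      [3, 0, 5]]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]
y = spmv_csr(values, col_idx, row_ptr, x)
```

The inner loop reads `x` through `col_idx`, and rows can hold very different numbers of nonzeros. Those two properties are generally what make it hard to keep SIMD lanes and memory bandwidth busy in CSR-based SpMV.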

Cite

Zhang, K., Chen, S., Wang, Y., & Wan, J. (2013). Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors. IEICE Electronics Express, 10(9). https://doi.org/10.1587/elex.10.20130147
