Designing parallel sparse matrix transposition algorithm using CSR for GPUs

Tien Hsiung Weng; Hoa Pham; Hai Jiang; Kuan Ching Li

Conference Proceedings

Lecture Notes in Electrical Engineering (2013) 234 LNEE 251-257

DOI: 10.1007/978-1-4614-6747-2_31

Abstract

In this chapter, we propose a parallel algorithm for sparse matrix transposition in the compressed sparse row (CSR) format for many-core GPUs, written in CUDA to exploit the GPU's computational power and memory bandwidth. Our code runs on a platform with a quad-core Intel Xeon64 E5507 CPU and an NVIDIA GTX 470 GPU. We measure the performance of the algorithm on inputs ranging from small to large matrices; the preliminary experimental results scale well up to 512 threads and are promising for larger matrices. © 2013 Springer Science+Business Media New York.
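
The abstract does not spell out the steps of the algorithm, so the following is only a minimal sequential sketch of the standard counting-sort approach to CSR transposition, not the authors' CUDA implementation. The function name `csr_transpose` and its parameter names are assumptions chosen for illustration. The comments mark the two loops (column histogram and scatter) that a GPU version would typically distribute across threads, for instance with atomic increments.

```python
def csr_transpose(n_rows, n_cols, row_ptr, col_idx, vals):
    """Transpose a CSR matrix; returns (row_ptr, col_idx, vals) of the transpose.

    Hypothetical reference routine for illustration only.
    """
    nnz = row_ptr[n_rows]

    # Step 1: count the nonzeros in each column of the input.
    # On a GPU this histogram is a natural candidate for atomic adds.
    t_row_ptr = [0] * (n_cols + 1)
    for k in range(nnz):
        t_row_ptr[col_idx[k] + 1] += 1

    # Step 2: prefix sum turns the counts into row offsets of the transpose.
    for c in range(n_cols):
        t_row_ptr[c + 1] += t_row_ptr[c]

    # Step 3: scatter every nonzero into the next free slot of its column.
    # On a GPU each thread could handle one row or one nonzero.
    next_slot = t_row_ptr[:n_cols]  # working copy of the offsets
    t_col_idx = [0] * nnz
    t_vals = [0] * nnz
    for r in range(n_rows):
        for k in range(row_ptr[r], row_ptr[r + 1]):
            c = col_idx[k]
            dst = next_slot[c]
            t_col_idx[dst] = r      # old row index becomes the column index
            t_vals[dst] = vals[k]
            next_slot[c] += 1

    return t_row_ptr, t_col_idx, t_vals
```

Because the rows are visited in increasing order, the column indices within each row of the result come out sorted. A parallel scatter would not guarantee that ordering without an extra sorting or ranking step.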

Weng, T. H., Pham, H., Jiang, H., & Li, K. C. (2013). Designing parallel sparse matrix transposition algorithm using CSR for GPUs. In Lecture Notes in Electrical Engineering (Vol. 234 LNEE, pp. 251–257). https://doi.org/10.1007/978-1-4614-6747-2_31
