Parallel efficient sparse matrix-matrix multiplication on multicore platforms

Abstract

Sparse matrix-matrix multiplication (SpGEMM) is a key kernel in many high-performance computing applications, such as algebraic multigrid solvers and graph analytics. Optimizing SpGEMM on modern processors is challenging due to random data accesses, poor data locality, and load imbalance during computation. In this work, we investigate different partitioning techniques, cache optimizations (using dense arrays instead of hash tables), and dynamic load balancing for SpGEMM on a diverse set of real-world and synthetic datasets. We demonstrate that our implementation outperforms the state of the art on Intel® Xeon® processors: it is up to 3.8X faster than the Intel® Math Kernel Library (MKL) and up to 257X faster than CombBLAS. We also outperform the best published GPU implementations of SpGEMM on the NVIDIA GTX Titan and the AMD Radeon HD 7970 by up to 7.3X and 4.5X, respectively, on their published datasets. We demonstrate good multi-core scalability (geomean speedup of 18.2X on 28 threads), compared to MKL, which achieves 7.5X scaling on 28 threads.
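To illustrate the "dense array instead of hash table" idea mentioned in the abstract, below is a minimal sketch of row-wise SpGEMM (Gustavson's algorithm) on CSR matrices that accumulates each output row into a dense array and resets only the touched entries. This is an assumption-laden illustration of the general technique, not the authors' implementation; the struct and function names are hypothetical, and the parallelization and partitioning strategies studied in the paper are omitted.

```cpp
// Sketch of row-wise SpGEMM with a dense accumulator (hypothetical names,
// not the paper's code). C = A * B, with all matrices in CSR format.
#include <vector>

struct CSR {
    int rows = 0, cols = 0;
    std::vector<int> rowptr;   // size rows + 1
    std::vector<int> colidx;   // column index of each nonzero
    std::vector<double> vals;  // value of each nonzero
};

CSR spgemm_dense_accumulator(const CSR& A, const CSR& B) {
    CSR C;
    C.rows = A.rows;
    C.cols = B.cols;
    C.rowptr.assign(A.rows + 1, 0);

    std::vector<double> acc(B.cols, 0.0);  // dense accumulator for one row of C
    std::vector<char> touched(B.cols, 0);  // marks columns hit in this row
    std::vector<int> touched_cols;         // list of hit columns, for cheap reset

    for (int i = 0; i < A.rows; ++i) {
        touched_cols.clear();
        // Accumulate row i of C: for each nonzero A(i,k), scale row k of B.
        for (int p = A.rowptr[i]; p < A.rowptr[i + 1]; ++p) {
            int k = A.colidx[p];
            double a = A.vals[p];
            for (int q = B.rowptr[k]; q < B.rowptr[k + 1]; ++q) {
                int j = B.colidx[q];
                if (!touched[j]) { touched[j] = 1; touched_cols.push_back(j); }
                acc[j] += a * B.vals[q];
            }
        }
        // Flush the accumulated row into C, resetting only touched entries.
        for (int j : touched_cols) {
            C.colidx.push_back(j);
            C.vals.push_back(acc[j]);
            acc[j] = 0.0;
            touched[j] = 0;
        }
        C.rowptr[i + 1] = static_cast<int>(C.colidx.size());
    }
    return C;
}
```

In a multithreaded variant along the lines the abstract describes, each thread would presumably keep its own accumulator and rows would be assigned to threads dynamically to address load imbalance; those details are not shown here.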

Citation (APA)

Patwary, M. M. A., Satish, N. R., Sundaram, N., Park, J., Anderson, M. J., Vadlamudi, S. G., … Dubey, P. (2015). Parallel efficient sparse matrix-matrix multiplication on multicore platforms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9137 LNCS, pp. 48–57). Springer Verlag. https://doi.org/10.1007/978-3-319-20119-1_4
