Matrix-matrix multiplication is an important linear algebra operation with a myriad of applications in scientific and engineering computing. Owing to the relevance and inherent parallelism of this operation, many high-performance implementations exist for a variety of hardware platforms. Exploiting the structure of the matrices involved in the operation generally yields significant savings in time and memory. This is the case, e.g., when one of the matrices is a symmetric band matrix. This work presents two efficient specialized implementations of the operation for the case where a symmetric band matrix is involved and the target architecture contains a graphics processor (GPU). In particular, both implementations exploit the structure of the matrices to leverage the vast parallelism of the underlying hardware. The experimental results show remarkable reductions in computation time over the tuned implementations of the same operation provided by MKL and CUBLAS.
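To illustrate where the time and memory savings come from, the following is a minimal CPU sketch in plain Python. It is not the GPU implementation described in this work. It assumes a symmetric n × n matrix A with bandwidth k, packed into k + 1 rows in the style of lower band storage, and computes C = A·B from the packed data alone. The function names `to_lower_band` and `sym_band_matmul` are hypothetical and chosen only for this example.

```python
def to_lower_band(a, k):
    """Pack the lower band of a symmetric n x n matrix into k + 1 rows.

    band[d][j] holds a[j + d][j]; positions that fall outside the
    matrix are left as 0.0. (Hypothetical helper for illustration.)
    """
    n = len(a)
    band = [[0.0] * n for _ in range(k + 1)]
    for d in range(k + 1):
        for j in range(n - d):
            band[d][j] = a[j + d][j]
    return band


def sym_band_matmul(band, b):
    """Compute C = A @ B using only the packed lower band of symmetric A."""
    k = len(band) - 1
    n = len(band[0])
    m = len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for j in range(n):
        # Contribution of the diagonal entry A[j][j].
        for col in range(m):
            c[j][col] += band[0][j] * b[j][col]
        # Each stored sub-diagonal entry A[i][j] (i > j) is used twice:
        # once as A[i][j] and once as its mirror image A[j][i].
        for d in range(1, min(k, n - 1 - j) + 1):
            i = j + d
            a_ij = band[d][j]
            for col in range(m):
                c[i][col] += a_ij * b[j][col]
                c[j][col] += a_ij * b[i][col]
    return c


# Small tridiagonal example (bandwidth k = 1).
A = [[2, 1, 0, 0],
     [1, 2, 1, 0],
     [0, 1, 2, 1],
     [0, 0, 1, 2]]
B = [[1, 0],
     [0, 1],
     [1, 1],
     [2, 0]]
C = sym_band_matmul(to_lower_band(A, 1), B)
print(C)  # [[2.0, 1.0], [2.0, 3.0], [4.0, 3.0], [5.0, 1.0]]
```

In this sketch the packed form stores (k + 1)·n values instead of n², and the multiplication touches only entries inside the band, so the work is on the order of n·k·m multiply-adds rather than n²·m for a dense product with an n × m matrix B.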
CITATION STYLE
Dufrechou, E., Ezzatti, P., Quintana-Ortí, E. S., & Remón, A. (2014). Efficient symmetric band matrix-matrix multiplication on GPUs. In Communications in Computer and Information Science (Vol. 485, pp. 1–12). Springer Verlag. https://doi.org/10.1007/978-3-662-45483-1_1