Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs

Théo Mary; Ichitaro Yamazaki; Jakub Kurzak; Piotr Luszczek; Stanimire Tomov; Jack Dongarra

Conference Proceedings (Open Access)

International Conference for High Performance Computing, Networking, Storage and Analysis, SC (2015) 15-20-November-2015

DOI: 10.1145/2807591.2807613

5 Citations

23 Readers

Abstract

Low-rank approximations of dense matrices play an important role in many applications. A common way to compute such an approximation is the QR factorization with column pivoting (QRCP). Although the reliability and efficiency of QRCP have been demonstrated, this deterministic approach requires costly communication at each step of the factorization. Since such communication is becoming increasingly expensive on modern computers, an alternative approach based on random sampling, which can be implemented using communication-optimal kernels, is becoming attractive. To study its potential, in this paper we compare the performance of random sampling with that of QRCP on an NVIDIA Kepler GPU. Our performance results demonstrate that random sampling can be up to 12.8x faster than the deterministic approach when computing an approximation of the same accuracy. We also present the parallel scaling of random sampling over multiple GPUs on a single compute node, showing a speedup of 3.8x over three Kepler GPUs. These results demonstrate the potential of random sampling as an excellent computational tool for many applications, and this potential is likely to grow on emerging computers as communication costs increase.
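The abstract does not spell out the sampling algorithm, so the following is only a minimal sketch of a generic randomized range finder, run on the CPU with NumPy rather than on a GPU. The function name, the oversampling amount, and the number of power iterations are illustrative assumptions, not the authors' implementation. It is meant to show why the approach maps onto matrix-matrix products and unpivoted QR factorizations, rather than onto a pivoting step that touches the whole matrix at every column.

```python
import numpy as np


def random_sampling_lowrank(A, k, oversample=10, power_iters=1, seed=0):
    """Return Q (m x l) and B (l x n) such that A is approximately Q @ B.

    Hypothetical helper for illustration; l = k + oversample is an assumed
    choice of sampling size, capped at min(m, n).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, m, n)

    # Sample the column space of A with a Gaussian test matrix (one GEMM).
    omega = rng.standard_normal((n, l))
    Q, _ = np.linalg.qr(A @ omega)

    # Optional power iterations sharpen the basis when singular values
    # decay slowly; each pass is two GEMMs and two unpivoted QRs.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Project A onto the sampled basis.
    B = Q.T @ A
    return Q, B


def demo():
    """Relative Frobenius error on a synthetic matrix of exact rank 8."""
    rng = np.random.default_rng(1)
    m, n, r = 200, 150, 8
    A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))
    Q, B = random_sampling_lowrank(A, k=r)
    return np.linalg.norm(A - Q @ B) / np.linalg.norm(A)
```

Calling `demo()` returns the relative error of the approximation on a synthetic test matrix. A deterministic baseline for comparison could be obtained from a pivoted QR such as `scipy.linalg.qr(A, pivoting=True)`, which is not used here to keep the sketch dependent on NumPy only.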

Cite

Mary, T., Yamazaki, I., Kurzak, J., Luszczek, P., Tomov, S., & Dongarra, J. (2015). Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC (Vol. 15-20-November-2015). IEEE Computer Society. https://doi.org/10.1145/2807591.2807613

Readers' Seniority

PhD / Post grad / Masters / Doc: 13 (72%)

Researcher: 3 (17%)

Professor / Associate Prof.: 2 (11%)

Readers' Discipline

Computer Science: 14 (74%)

Physics and Astronomy: 2 (11%)

Mathematics: 2 (11%)

Engineering: 1
