Singular value decomposition (SVD) is a matrix factorization method widely used for dimension reduction, data analytics, information retrieval, and unsupervised learning. In many big-data applications, only the singular values of the SVD are needed. Methods such as tensor networks, however, require an accurate computation of a substantial number of singular vectors, which can be accomplished through truncated SVD (tSVD). Additionally, many real-world datasets are too large to fit into the available memory, which mandates the development of out-of-memory algorithms that assume that most of the data resides on an external disk during the entire computation. These algorithms reduce communication to disk and hide part of the communication by overlapping it with computation on blocks of work. Here, building upon previous works on SVD for dense matrices, we present a method for computing a predetermined number, K, of singular vectors, and the corresponding K singular values, of a matrix that cannot fit in memory. Our out-of-memory tSVD can be used for tensor network algorithms. We describe ways of reducing communication during the computation of the left and right reflectors needed to compute the singular vectors, and introduce a method for estimating the block sizes needed to hide the communication on parallel file systems.
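The factorization the abstract describes can be illustrated on a small in-memory matrix. This is only a sketch of what tSVD computes (the top-K singular triplets), not the paper's out-of-memory algorithm; the names `A` and `K` are illustrative.

```python
import numpy as np

# Illustrative data: the paper targets matrices too large for memory;
# here a small dense matrix stands in for demonstration purposes.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 100))
K = 5  # number of singular triplets to keep

# Full SVD (singular values returned in descending order), then truncate to rank K.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, Vt_k = U[:, :K], s[:K], Vt[:K, :]

# Best rank-K approximation of A (Eckart-Young theorem): in the spectral
# norm, the approximation error equals the (K+1)-th singular value.
A_k = U_k @ np.diag(s_k) @ Vt_k
err = np.linalg.norm(A - A_k, 2)
```

An out-of-memory variant would compute the same `U_k`, `s_k`, `Vt_k` while streaming blocks of `A` from disk and overlapping I/O with computation, as the abstract outlines.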
Carrillo-Cabada, H., Skau, E., Chennupati, G., Alexandrov, B., & Djidjev, H. (2020). An out of Memory tSVD for Big-Data Factorization. IEEE Access, 8, 107749–107759. https://doi.org/10.1109/ACCESS.2020.3000508