ParaSketch: Parallel tensor factorization via sketching

Citations: 18
Mendeley readers: 15

Abstract

Tensor factorization methods have gained increasing popularity in the data mining community. A key feature that renders tensors attractive is the essential uniqueness (identifiability) of their decomposition into latent factors; this is crucial for exploratory data analysis, since model uniqueness makes interpretations well grounded. In this work, we propose ParaSketch, a distributed tensor factorization algorithm that enables massive parallelism for dealing with large tensors. The idea is to compress/sketch the large tensor into multiple small tensors, decompose each small tensor, and combine the results to reconstruct the desired latent factors. Prior art in this direction entails potentially very high complexity in the (Gaussian) compression and final combining stages. By using sketching matrices for compression, the proposed method greatly reduces compression complexity and features much simpler combining. Moreover, theoretical analysis shows that the compressed tensors inherit latent identifiability under mild conditions, establishing the correctness of the overall approach. Our approach to establishing identifiability for the sketched tensor is original, and of interest in its own right.
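The compression stage described above can be illustrated with a minimal numpy sketch. This is not the paper's implementation: the CountSketch-style matrices, the tensor sizes, and the sketch dimensions below are illustrative assumptions. It shows the key structural fact the combining stage relies on: multilinearly compressing a CP-structured tensor yields a small tensor whose CP factors are the sketched versions of the original factors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small synthetic 3-way tensor with known CP rank F (sizes are illustrative).
I, J, K, F = 30, 40, 50, 3
A = rng.standard_normal((I, F))
B = rng.standard_normal((J, F))
C = rng.standard_normal((K, F))
# T[i, j, k] = sum_f A[i, f] * B[j, f] * C[k, f]
T = np.einsum('if,jf,kf->ijk', A, B, C)

def countsketch_matrix(m, n, rng):
    """Sparse sketching matrix: one random +-1 entry per column,
    so multiplying by it costs O(n) per vector instead of O(m n)."""
    S = np.zeros((m, n))
    rows = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    S[rows, np.arange(n)] = signs
    return S

# Compress each mode with an independent sketch (sketch sizes hypothetical).
m1, m2, m3 = 8, 8, 8
S1 = countsketch_matrix(m1, I, rng)
S2 = countsketch_matrix(m2, J, rng)
S3 = countsketch_matrix(m3, K, rng)

# Multilinear compression: Tc = T x1 S1 x2 S2 x3 S3.
Tc = np.einsum('pi,qj,rk,ijk->pqr', S1, S2, S3, T)

# The compressed tensor retains CP structure with factors S1 A, S2 B, S3 C;
# this is what lets each small decomposition recover sketched latent factors.
Tc_check = np.einsum('if,jf,kf->ijk', S1 @ A, S2 @ B, S3 @ C)
print(np.allclose(Tc, Tc_check))  # True
```

In the full algorithm, many such compressed replicas would be decomposed in parallel and the sketched factors combined to recover A, B, and C; that combining step is the subject of the paper's analysis.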

Cite (APA)

Yang, B., Zamzam, A., & Sidiropoulos, N. D. (2018). ParaSketch: Parallel tensor factorization via sketching. In SIAM International Conference on Data Mining, SDM 2018 (pp. 396–404). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611975321.45
