Matrix factorization (MF) extracts low-rank features from high-dimensional data and integrates information about the data manifold distribution, thereby capturing nonlinear neighborhood information. MF has thus drawn wide attention for the low-rank analysis of sparse big data, e.g., in Collaborative Filtering (CF) recommender systems, social networks, and Quality of Service prediction. However, two problems remain: (1) the large computational overhead of constructing the Graph Similarity Matrix (GSM), and (2) the large memory overhead of storing the intermediate GSM. Consequently, GSM-based MF methods, e.g., kernel MF and graph-regularized MF, cannot be applied directly to the low-rank analysis of sparse big data on cloud and edge platforms. To address this problem, we propose Locality Sensitive Hashing (LSH) aggregated MF (LSH-MF), which offers two advantages: (1) its probabilistic projection strategy avoids constructing the GSM while still projecting sparse big data accurately, and (2) it supports fine-grained parallelization and online learning on GPUs through CULSH-MF, a CUDA-parallel implementation that we also propose. Experimental results show that CULSH-MF not only reduces computational time and memory overhead but also achieves higher accuracy. Compared with deep learning models, CULSH-MF saves training time while achieving the same accuracy.
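For readers unfamiliar with the general idea behind LSH-based projection, the following is a minimal, illustrative Python sketch of random-projection (SimHash-style) locality sensitive hashing for sparse vectors: items whose hash codes collide are treated as approximate neighbors, so neighborhood information can be gathered without materializing an n × n pairwise similarity matrix. This is not the paper's LSH-MF algorithm; the function names, bucket structure, and parameters are illustrative assumptions.

```python
# Illustrative sketch only: generic random-projection (SimHash-style) LSH.
# It shows how approximate neighbors can be grouped without building a
# full pairwise Graph Similarity Matrix; it is NOT the LSH-MF algorithm.
from collections import defaultdict

import numpy as np
from scipy.sparse import random as sparse_random


def lsh_buckets(X, num_bits=16, seed=0):
    """Hash each row of a sparse matrix X (n_items x n_features) into a bucket.

    Rows landing in the same bucket are treated as approximate neighbors,
    so neighborhood structure costs O(n * num_bits) projections rather than
    the O(n^2) cost of an explicit similarity matrix.
    """
    rng = np.random.default_rng(seed)
    # Random hyperplanes; the sign pattern of the projection is the hash code.
    planes = rng.standard_normal((X.shape[1], num_bits))
    signs = (X @ planes) > 0            # shape: (n_items, num_bits), boolean
    buckets = defaultdict(list)
    for i, code in enumerate(signs):
        buckets[code.tobytes()].append(i)
    return buckets


if __name__ == "__main__":
    # 1,000 sparse "items" with 5,000 features, ~1% density (toy data).
    X = sparse_random(1000, 5000, density=0.01, format="csr", random_state=42)
    buckets = lsh_buckets(X, num_bits=16)
    print(f"{len(buckets)} buckets; largest holds "
          f"{max(len(v) for v in buckets.values())} items")
```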
Li, Z., Li, H., Li, K., Wu, F., Chen, L., & Li, K. (2021). Locality Sensitive Hash Aggregated Nonlinear Neighborhood Matrix Factorization for Online Sparse Big Data Analysis. ACM/IMS Transactions on Data Science, 2(4), 1–27. https://doi.org/10.1145/3497749