Factorization-based models have achieved great success in online advertising and recommender systems due to their ability to efficiently model feature combinations. These models encode feature interactions via inner products between feature embeddings. Despite the improved generalization, the memory consumption of these models grows significantly, because they usually take hundreds to thousands of large categorical features as input. Several existing works try to reduce the memory footprint via hashing, randomized embedding composition, and dimensionality search, but they suffer from either substantial performance degradation or limited memory compression. To this end, in this paper we propose an extremely memory-efficient Factorization Machine (xLightFM), in which each category embedding is composited from latent vectors selected from codebooks. Based on the characteristics of each categorical feature, we further propose to adapt the size of the codebook used to composite each categorical feature's embedding with neural architecture search techniques. This pushes the limits of memory compression further while incurring negligible degradation, or even some improvement, in prediction performance. We extensively evaluate the proposed algorithm on two real-world datasets. The results demonstrate that xLightFM outperforms state-of-the-art lightweight factorization-based methods in terms of both prediction quality and memory footprint, achieving more than 18x and 27x memory compression relative to the vanilla FM on the two datasets, respectively.
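As background for the interaction encoding mentioned above, the standard second-order factorization machine scores an input $\mathbf{x}$ with per-feature embeddings $\mathbf{v}_i$ as follows; this is the textbook FM formulation, not notation taken from the paper itself:

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
    + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
```

It is the table of embeddings $\mathbf{v}_i$, one per category across all fields, that dominates the memory footprint and that xLightFM compresses.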
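The central compression idea, compositing each category embedding from latent vectors selected from codebooks instead of storing a full embedding table, can be sketched as below. This is a minimal illustration in the spirit of product quantization; the field size, the number of codebooks, and the additive composition are assumptions for exposition, not the paper's exact design.

```python
import torch

# Illustrative sketch (assumptions, not the paper's code): composite
# category embeddings from codebook latent vectors instead of storing
# one full float vector per category.
num_categories = 100_000  # hypothetical vocabulary size of one feature field
embed_dim = 16            # hypothetical embedding dimension
num_codebooks = 4         # codebooks per field (assumed)
codebook_size = 64        # latent vectors per codebook (searched by NAS in the paper)

# Each codebook holds codebook_size latent vectors of dimension embed_dim.
codebooks = torch.randn(num_codebooks, codebook_size, embed_dim)

# Each category stores only num_codebooks small integer indices,
# not an embed_dim-dimensional float vector.
codes = torch.randint(0, codebook_size, (num_categories, num_codebooks))

def composite_embedding(category_ids: torch.Tensor) -> torch.Tensor:
    """Form each category's embedding by summing its selected latent vectors."""
    selected = codes[category_ids]  # (batch, num_codebooks)
    parts = [codebooks[k, selected[:, k]] for k in range(num_codebooks)]
    return torch.stack(parts, dim=0).sum(dim=0)  # (batch, embed_dim)

# Memory intuition: a full table stores num_categories * embed_dim floats;
# the composited form stores num_categories * num_codebooks small integers
# plus num_codebooks * codebook_size * embed_dim floats for the codebooks.
example = composite_embedding(torch.tensor([0, 42, 7]))
assert example.shape == (3, embed_dim)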
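For the codebook-size adaptation, one common differentiable-NAS pattern is to relax the discrete choice of size into a softmax-weighted mixture over candidates. The sketch below follows that pattern for a single feature field; the candidate sizes, the soft-assignment lookup, and all names are hypothetical and may differ from the paper's actual search formulation.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of DARTS-style search over codebook sizes for one
# feature field; not the paper's exact NAS procedure.
candidate_sizes = [16, 32, 64]  # assumed candidate codebook sizes
max_size = max(candidate_sizes)
embed_dim = 16

codebook = torch.randn(max_size, embed_dim, requires_grad=True)
arch_logits = torch.zeros(len(candidate_sizes), requires_grad=True)

def searched_lookup(assign_logits: torch.Tensor) -> torch.Tensor:
    """Mix soft codebook lookups under each candidate size, weighted by
    the softmax of the architecture logits."""
    size_weights = F.softmax(arch_logits, dim=0)
    out = torch.zeros(assign_logits.shape[0], embed_dim)
    for w, size in zip(size_weights, candidate_sizes):
        # A smaller candidate size only sees the first `size` codewords.
        probs = F.softmax(assign_logits[:, :size], dim=1)  # (batch, size)
        out = out + w * (probs @ codebook[:size])          # (batch, embed_dim)
    return out

demo = searched_lookup(torch.randn(8, max_size))
assert demo.shape == (8, embed_dim)

# After the search converges, only the size with the largest architecture
# weight would be kept for that field:
# chosen_size = candidate_sizes[int(arch_logits.argmax())]
```

Training the architecture logits jointly with the codebooks lets each field settle on a size matched to its characteristics, which is how the abstract's per-feature adaptation can be read.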
Citation: Jiang, G., Wang, H., Chen, J., Wang, H., Lian, D., & Chen, E. (2021). xLightFM: Extremely Memory-Efficient Factorization Machine. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pp. 337–346. Association for Computing Machinery. https://doi.org/10.1145/3404835.3462941