Learning word embeddings from Wikipedia for content-based recommender systems

Cataldo Musto; Giovanni Semeraro; Marco de Gemmis; Pasquale Lops

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016), vol. 9626, pp. 729-734

DOI: 10.1007/978-3-319-30671-1_60

Abstract

In this paper we present a preliminary investigation into the adoption of Word Embedding techniques in a content-based recommendation scenario. Specifically, we compared the effectiveness of three widespread approaches, namely Latent Semantic Indexing, Random Indexing and Word2Vec, in the task of learning a vector space representation of both the items to be recommended and the user profiles. To this end, we developed a content-based recommender system (CBRS) framework that uses textual features extracted from Wikipedia to learn user profiles based on such Word Embeddings, and we evaluated this framework on two state-of-the-art datasets. The experimental results provided interesting insights: our CBRS based on Word Embeddings showed results comparable to those of well-performing algorithms based on Collaborative Filtering and Matrix Factorization, especially in high-sparsity recommendation scenarios.
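
The abstract describes placing items and user profiles in the same vector space but does not spell out how. The following is only a minimal sketch of one common way to do this, assuming that an item is the average (centroid) of its word vectors, that a user profile is the centroid of the items the user liked, and that candidates are ranked by cosine similarity. The word vectors, item names and function names are hypothetical and hand-written for illustration; real vectors would be learned from Wikipedia text with LSI, Random Indexing or Word2Vec.

```python
import math

# Hypothetical 3-dimensional word vectors (NOT learned embeddings).
WORD_VECTORS = {
    "space":   [0.9, 0.1, 0.0],
    "alien":   [0.8, 0.2, 0.1],
    "robot":   [0.7, 0.0, 0.3],
    "love":    [0.0, 0.9, 0.1],
    "wedding": [0.1, 0.8, 0.0],
}

def centroid(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    if not vectors:
        raise ValueError("cannot average an empty list of vectors")
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def item_vector(words):
    """Assumed item representation: centroid of the known word vectors."""
    return centroid([WORD_VECTORS[w] for w in words if w in WORD_VECTORS])

def recommend(liked_items, candidates):
    """Rank candidates by similarity to the centroid of the liked items."""
    profile = centroid([item_vector(words) for words in liked_items])
    scored = [(name, cosine(profile, item_vector(words)))
              for name, words in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Usage: a user who liked two science-fiction items.
liked = [["space", "alien"], ["robot", "space"]]
candidates = {"sci_fi": ["alien", "robot"], "romance": ["love", "wedding"]}
ranking = recommend(liked, candidates)
print(ranking)
```

Because items and profiles live in the same space, no rating data from other users is needed to score a candidate, which is one plausible reason such an approach could remain usable when ratings are sparse.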

Citation (APA)

Musto, C., Semeraro, G., de Gemmis, M., & Lops, P. (2016). Learning word embeddings from Wikipedia for content-based recommender systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9626, pp. 729–734). Springer Verlag. https://doi.org/10.1007/978-3-319-30671-1_60