A major challenge for collaborative filtering (CF) techniques in recommender systems is the data sparsity caused by missing and noisy ratings. This problem is even more serious for CF domains where the ratings are expressed numerically, e.g. as 5-star grades. We assume the 5-star ratings are unordered bins instead of ordinal relative preferences. We observe that, while numerical ratings may be scarce, additional auxiliary data are sometimes available in the form of binary ratings. This is especially true because users can easily express their preferences as likes or dislikes for items. In this paper, we explore how to use these binary auxiliary preference data to reduce the impact of data sparsity for CF domains expressed in numerical ratings. We solve this problem by transferring rating knowledge from an auxiliary data source in binary form (that is, likes or dislikes) to a target numerical rating matrix. In particular, our solution is to model both the numerical ratings and the like/dislike ratings in a principled way. We present a novel framework, Transfer by Collective Factorization (TCF), in which we construct a shared latent space collectively and learn the data-dependent effect separately. A major advantage of the TCF approach over the previous bilinear method of collective matrix factorization is that it captures the data-dependent effect while sharing the data-independent knowledge, which increases the overall quality of knowledge transfer. We present extensive experimental results that demonstrate the effectiveness of TCF at various sparsity levels and show improvements over several state-of-the-art methods. © 2013 Elsevier B.V.
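The abstract does not spell out the factorization, so the following is only a minimal sketch of one way to read "construct a shared latent space collectively and learn the data-dependent effect separately". It assumes a tri-factor form in which the target matrix is approximated by U B V^T and the auxiliary like/dislike matrix by U B_aux V^T, with the user factors U and item factors V shared and the small inner matrices B and B_aux learned per data source. The function name `tcf_sketch`, the parameters `lam`, `reg`, `lr`, and `steps`, and the use of plain gradient descent are all illustrative assumptions, not the authors' algorithm.

```python
import numpy as np


def tcf_sketch(R, M, R_aux, M_aux, d=2, lam=1.0, reg=0.01, lr=0.01,
               steps=1000, seed=0):
    """Hypothetical sketch: jointly factorize a numerical rating matrix R
    and a binary auxiliary matrix R_aux with shared U, V.

    R, R_aux : (n, m) arrays of ratings (unobserved entries may hold anything)
    M, M_aux : (n, m) 0/1 masks marking which entries are observed
    lam      : assumed weight on the auxiliary reconstruction error
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    # Shared, data-independent factors (assumption: same users and items
    # appear in both matrices).
    U = 0.1 * rng.standard_normal((n, d))
    V = 0.1 * rng.standard_normal((m, d))
    # Separate, data-dependent inner matrices, one per data source.
    B = np.eye(d)
    B_aux = np.eye(d)

    losses = []
    for _ in range(steps):
        # Masked residuals: only observed entries contribute.
        E = M * (U @ B @ V.T - R)
        E_aux = M_aux * (U @ B_aux @ V.T - R_aux)

        loss = 0.5 * np.sum(E ** 2) + 0.5 * lam * np.sum(E_aux ** 2)
        loss += 0.5 * reg * (np.sum(U ** 2) + np.sum(V ** 2)
                             + np.sum(B ** 2) + np.sum(B_aux ** 2))
        losses.append(loss)

        # Gradients of the loss above with respect to each factor.
        gU = E @ V @ B.T + lam * (E_aux @ V @ B_aux.T) + reg * U
        gV = E.T @ U @ B + lam * (E_aux.T @ U @ B_aux) + reg * V
        gB = U.T @ E @ V + reg * B
        gB_aux = lam * (U.T @ E_aux @ V) + reg * B_aux

        # Plain gradient descent step (a stand-in optimizer).
        U -= lr * gU
        V -= lr * gV
        B -= lr * gB
        B_aux -= lr * gB_aux

    return U, V, B, B_aux, losses
```

In this sketch, a missing target rating for user `i` and item `j` would be predicted as `(U @ B @ V.T)[i, j]`. Because `U` and `V` receive gradient from both matrices, the binary data can inform the target predictions even where numerical ratings are sparse, while `B` and `B_aux` absorb what differs between the two rating scales. Rescaling the 5-star ratings to a range comparable to the binary data (for example dividing by 5) is a further assumption that keeps the fixed step size well behaved.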
Pan, W., & Yang, Q. (2013). Transfer learning in heterogeneous collaborative filtering domains. Artificial Intelligence, 197, 39–55. https://doi.org/10.1016/j.artint.2013.01.003