The cyclic nature of the recommendation task is increasingly being taken into account in recommender systems research. Along these lines, framing interactive recommendation as a genuine reinforcement learning problem, multi-armed bandit approaches have increasingly been considered as a means to cope with the dual exploitation/exploration goal of recommendation. In this paper we develop a simple multi-armed bandit elaboration of neighbor-based collaborative filtering. The approach can be seen as a variant of the nearest-neighbors scheme, endowed with controlled stochastic exploration of the user neighborhood through a parameter-free application of Thompson sampling. Our approach builds on a formal development and a reasonably simple design, which aims to make it easy to reproduce and further elaborate upon. We report experiments on datasets from different domains showing that neighbor-based bandits indeed achieve recommendation accuracy improvements in the mid to long run.
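The abstract only outlines the approach at a high level. As a rough illustration of what a nearest-neighbor bandit with Thompson sampling over the user neighborhood could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the class name NeighborBandit, the Beta(alpha, beta) posterior per (target user, neighbor) pair, and the feedback-update rule are not taken from the paper, whose exact formulation should be consulted in the citation below.

```python
import numpy as np

class NeighborBandit:
    """Illustrative nearest-neighbor bandit with Thompson sampling.

    For each candidate neighbor v of a target user u, a Beta posterior
    tracks how often v's positively-rated items turned out to be relevant
    for u. At each step a similarity is *sampled* from each posterior
    (Thompson sampling), the top-k sampled neighbors are kept, and an
    unseen item liked by those neighbors is recommended.
    """

    def __init__(self, n_users, n_items, k=10, rng=None):
        self.k = k
        self.rng = rng or np.random.default_rng()
        # Beta(alpha, beta) posterior per (target user, neighbor) pair.
        self.alpha = np.ones((n_users, n_users))
        self.beta = np.ones((n_users, n_users))
        # Binary positive feedback and exposure observed so far.
        self.pos = np.zeros((n_users, n_items), dtype=bool)
        self.seen = np.zeros((n_users, n_items), dtype=bool)

    def recommend(self, u):
        # Thompson sampling: draw one similarity per candidate neighbor.
        sampled = self.rng.beta(self.alpha[u], self.beta[u])
        sampled[u] = -np.inf  # exclude the user as their own neighbor
        neighbors = np.argsort(sampled)[-self.k:]
        # Score items by the sampled similarity of the neighbors who liked them.
        scores = sampled[neighbors] @ self.pos[neighbors].astype(float)
        scores[self.seen[u]] = -np.inf  # do not repeat recommendations
        return int(np.argmax(scores))

    def update(self, u, item, relevant):
        # Update the posteriors of every user who had already liked the
        # recommended item: they acted as "voting" neighbors for u.
        voters = np.where(self.pos[:, item])[0]
        if relevant:
            self.alpha[u, voters] += 1.0
        else:
            self.beta[u, voters] += 1.0
        # Record the observed feedback.
        self.seen[u, item] = True
        if relevant:
            self.pos[u, item] = True


if __name__ == "__main__":
    # Hypothetical interaction loop; real feedback would come from users or a simulator.
    bandit = NeighborBandit(n_users=100, n_items=500, k=10)
    item = bandit.recommend(u=3)
    bandit.update(u=3, item=item, relevant=True)
```

In this sketch, exploration comes solely from sampling the neighbor similarities rather than using their posterior means, which is what makes the scheme parameter-free in the sense of requiring no explicit exploration rate.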
Citation: Sanz-Cruzado, J., Castells, P., & López, E. (2019). A simple multi-armed nearest-neighbor bandit for interactive recommendation. In RecSys 2019 - 13th ACM Conference on Recommender Systems (pp. 358–362). Association for Computing Machinery. https://doi.org/10.1145/3298689.3347040