In implicit feedback-based recommender systems, user exposure data, which record whether a user has interacted with a recommended item, provide an important clue for selecting negative training samples. In this work, we improve the negative sampler by integrating the exposure data. We propose to generate high-quality negative instances through adversarial training, which favours difficult instances, and by optimizing an additional objective that favours the real negatives in exposure data. However, this idea is non-trivial to implement, since the distribution of exposure data is latent and the item space is discrete. To this end, we design a novel RNS method (short for Reinforced Negative Sampler) that generates exposure-alike negative instances through a feature matching technique instead of choosing them directly from exposure data. Optimized under the reinforcement learning framework, RNS is able to integrate the user preference signals in exposure data with hard negatives. Extensive experiments on two real-world datasets demonstrate the effectiveness and rationality of our RNS method. Our implementation is available at: https://github.com/dingjingtao/ReinforceNS.
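To make the general idea concrete, the following is a minimal sketch of a policy-gradient negative sampler. It is not the RNS implementation from the repository above. All names (`train_sampler`, `rec_scores`, `exposed_not_clicked`, `alpha`) are hypothetical, and the exposure reward is a plain membership indicator that stands in for the feature matching technique described in the abstract. The sampler is a softmax policy over a discrete candidate set, updated with REINFORCE so that it favours items that are hard for the recommender and items that were exposed but not interacted with.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_negative(logits, rng):
    """Draw one candidate index from the sampler's softmax policy."""
    probs = softmax(logits)
    idx = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    return idx, probs


def reinforce_step(logits, idx, probs, reward, lr=0.1):
    """One REINFORCE update on the logits.

    For a softmax policy, the gradient of log pi(idx) with respect to
    logit j is (1 if j == idx else 0) - probs[j].
    """
    return [
        logit + lr * reward * ((1.0 if j == idx else 0.0) - p)
        for j, (logit, p) in enumerate(zip(logits, probs))
    ]


def train_sampler(rec_scores, exposed_not_clicked, candidates,
                  steps=2000, alpha=0.5, seed=0):
    """Train the sampler policy and return its final probabilities.

    Assumptions (not from the paper): the reward is a fixed mix of
    - hardness: the recommender's score for the sampled item, and
    - exposure: 1.0 if the item was exposed but not interacted with.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(candidates)
    for _ in range(steps):
        idx, probs = sample_negative(logits, rng)
        item = candidates[idx]
        hardness = rec_scores[item]
        exposure = 1.0 if item in exposed_not_clicked else 0.0
        reward = alpha * hardness + (1.0 - alpha) * exposure
        logits = reinforce_step(logits, idx, probs, reward)
    return softmax(logits)


if __name__ == "__main__":
    candidates = [0, 1, 2, 3]
    rec_scores = {0: 0.1, 1: 0.9, 2: 0.2, 3: 0.8}  # toy recommender scores
    policy = train_sampler(rec_scores, {1}, candidates)
    print([round(p, 3) for p in policy])
```

In this toy setup the policy gradient is used because the item space is discrete, so the sampled item cannot be differentiated through directly. A real system would also update the recommender in alternation with the sampler, which this sketch omits.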
Ding, J., Quan, Y., He, X., Li, Y., & Jin, D. (2019). Reinforced negative sampling for recommendation with exposure data. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 2230–2236). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/309