Model-free RL-based recommender systems have recently received increasing research attention due to their ability to handle partial feedback and long-term rewards. However, most existing research ignores a critical feature of recommender systems: a user's feedback on the same item at different times is random. This stochastic-reward property differs fundamentally from classic RL scenarios with deterministic rewards and makes RL-based recommendation considerably more challenging. In this paper, we first demonstrate in a simulator environment that using the stochastic feedback directly causes a significant drop in performance. Then, to handle the stochastic feedback more efficiently, we design two stochastic reward stabilization frameworks that replace the direct stochastic feedback with the reward learned by a supervised model. Both frameworks are model-agnostic, i.e., they can effectively utilize various supervised models. We demonstrate the superiority of the proposed frameworks over different RL-based recommendation baselines with extensive experiments on a recommendation simulator as well as an industrial-level recommender system.
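To make the core idea concrete, below is a minimal illustrative sketch (not the authors' implementation) of reward stabilization in a model-free setting: the RL update consumes the output of a supervised reward estimator instead of the raw stochastic feedback. All names here are hypothetical, and a simple per-(user, item) running-mean estimator stands in for the supervised model, which the paper leaves model-agnostic.

```python
import numpy as np

rng = np.random.default_rng(0)
N_USERS, N_ITEMS = 50, 20

# Hypothetical ground-truth click probabilities: feedback on the same
# (user, item) pair is a Bernoulli draw, so the observed reward is stochastic.
true_ctr = rng.uniform(0.05, 0.6, size=(N_USERS, N_ITEMS))

# --- Supervised reward "stabilizer" --------------------------------------
# A running-mean estimate per (user, item) pair; any supervised regressor or
# classifier could be substituted, since the framework is model-agnostic.
reward_sum = np.zeros((N_USERS, N_ITEMS))
reward_cnt = np.zeros((N_USERS, N_ITEMS))

def stabilized_reward(user, item, observed):
    """Update the supervised estimate with the new observation and return
    the smoothed (expected) reward in place of the raw stochastic feedback."""
    reward_sum[user, item] += observed
    reward_cnt[user, item] += 1
    return reward_sum[user, item] / reward_cnt[user, item]

# --- Model-free RL part: one-step tabular Q-learning over (user, item) ---
Q = np.zeros((N_USERS, N_ITEMS))
alpha, eps = 0.1, 0.1

for step in range(20_000):
    user = rng.integers(N_USERS)
    # Epsilon-greedy recommendation.
    item = rng.integers(N_ITEMS) if rng.random() < eps else int(np.argmax(Q[user]))

    # Raw stochastic feedback (click / no click).
    observed = float(rng.random() < true_ctr[user, item])

    # Replace the raw feedback with the supervised model's estimate.
    r = stabilized_reward(user, item, observed)

    # Q update driven by the stabilized reward rather than the noisy signal.
    Q[user, item] += alpha * (r - Q[user, item])
```

In this toy setting the stabilized reward converges to the expected feedback, so the value estimates are driven by a low-variance target; the paper's frameworks apply the same principle with learned supervised models inside deep RL recommenders.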
Cai, T., Bao, S., Jiang, J., Zhou, S., Zhang, W., Gu, L., … Zhang, G. (2023). Model-free Reinforcement Learning with Stochastic Reward Stabilization for Recommender Systems. In SIGIR 2023 - Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2179–2183). Association for Computing Machinery, Inc. https://doi.org/10.1145/3539618.3592022