Recommender systems commonly train on user engagements because of their abundance, immediacy of feedback, and the insights they provide into users' preferences. However, this approach may unintentionally prioritize optimizing short-term engagement over a product's or business's long-term objectives. At Netflix, our recommender systems are designed with the goal of maximizing long-term member satisfaction. To achieve this objective, we adopt a practical approach that augments engagement data with reward signals aligned with long-term member satisfaction. This process of identifying, evaluating, and integrating reward signals into an existing learning algorithm is what we term reward innovation. In this work, we present the challenges of applying this approach to a large-scale recommender system and share how we address them.
Tang, G., Pan, J., Wang, H., & Basilico, J. (2023). Reward innovation for long-term member satisfaction. In Proceedings of the 17th ACM Conference on Recommender Systems, RecSys 2023 (pp. 396–399). Association for Computing Machinery, Inc. https://doi.org/10.1145/3604915.3608873