Abstract
Effective exploration is believed to positively influence the long-term user experience on recommendation platforms. Determining its exact benefits, however, has been challenging. Regular A/B tests on exploration often measure neutral or even negative engagement metrics while failing to capture its long-term benefits. Here we introduce new experiment designs to formally quantify the long-term value of exploration by examining its effects on the content corpus and connecting content corpus growth to the long-term user experience in real-world experiments. Having established the value of exploration, we investigate the Neural Linear Bandit algorithm as a general framework for introducing exploration into any deep-learning-based ranking system. We conduct live experiments on one of the largest short-form video recommendation platforms, which serves billions of users, to validate the new experiment designs, quantify the long-term value of exploration, and verify the effectiveness of the adopted Neural Linear Bandit algorithm for exploration.
Citation
Su, Y., Wang, X., Le, E. Y., Liu, L., Li, Y., Lu, H., … Chen, M. (2024). Long-Term Value of Exploration: Measurements, Findings and Algorithms. In WSDM 2024 - Proceedings of the 17th ACM International Conference on Web Search and Data Mining (pp. 636–644). Association for Computing Machinery, Inc. https://doi.org/10.1145/3616855.3635833