Estimating and penalizing preference shift in recommender systems

11Citations
Citations of this article
20Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Recommender systems trained via long-horizon optimization (e.g., reinforcement learning) will have incentives to actively manipulate user preferences through the recommended content. While some work has argued for making systems myopic to avoid this issue, even such systems can induce systematic undesirable preference shifts. Thus, rather than artificially stifling the capabilities of the system, in this work we explore how we can make capable systems that explicitly avoid undesirable shifts. We advocate for (1) estimating the preference shifts that would be induced by recommender system policies, and (2) explicitly characterizing what unwanted shifts are and assessing before deployment whether such policies will produce them-ideally even actively optimizing to avoid them. These steps involve two challenging ingredients: (1) requires the ability to anticipate how hypothetical policies would influence user preferences if deployed; instead, (2) requires metrics to assess whether such influences are manipulative or otherwise unwanted. We study how to do (1) from historical user interaction data by building a user predictive model that implicitly contains their preference dynamics; to address (2), we introduce the notion of a "safe policy", which defines a trust region within which behavior is believed to be safe. We show that recommender systems that optimize for staying in the trust region avoid manipulative behaviors (e.g., changing preferences in ways that make users more predictable), while still generating engagement.

Cite

CITATION STYLE

APA

Carroll, M., Hadfield-Menell, D., Russell, S., & Dragan, A. (2021). Estimating and penalizing preference shift in recommender systems. In RecSys 2021 - 15th ACM Conference on Recommender Systems (pp. 661–667). Association for Computing Machinery, Inc. https://doi.org/10.1145/3460231.3478849

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free