Using contextual bandits with behavioral constraints for constrained online movie recommendation

Abstract

AI systems that learn through reward feedback about the actions they take are increasingly deployed in domains that have a significant impact on our daily lives. In many cases the rewards should not be the only guiding criterion, as there are additional constraints and/or priorities imposed by regulations, values, preferences, or ethical principles. We detail a novel online system, based on an extension of the contextual bandits framework, that learns a set of behavioral constraints by observation and uses these constraints as a guide when making decisions in an online setting while still being reactive to reward feedback. In addition, our system can highlight features of the context that are predicted to be more rewarding and/or are in line with the behavioral constraints. We demonstrate the system by building an interactive interface for an online movie recommendation agent and show that our system is able to act within a set of behavioral constraints without significantly degrading overall performance.
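To make the idea in the abstract concrete, here is a minimal sketch of one way a bandit agent could combine a reward estimate with a constraint estimate learned from observed examples. It is not the authors' algorithm: the class names, the `weight` mixing parameter, the epsilon-greedy exploration rule, and the per-arm linear models trained by gradient steps are all assumptions made for illustration.

```python
import random


class LinearArmModel:
    """Per-arm linear estimate trained by online gradient steps (illustrative)."""

    def __init__(self, n_arms, n_features, lr=0.1):
        self.w = [[0.0] * n_features for _ in range(n_arms)]
        self.lr = lr

    def predict(self, arm, x):
        return sum(wi * xi for wi, xi in zip(self.w[arm], x))

    def update(self, arm, x, target):
        # One squared-error gradient step toward the observed target.
        err = target - self.predict(arm, x)
        self.w[arm] = [wi + self.lr * err * xi for wi, xi in zip(self.w[arm], x)]


class ConstrainedBandit:
    """Hypothetical agent mixing a reward model with a learned constraint model."""

    def __init__(self, n_arms, n_features, weight=0.5, epsilon=0.1, seed=0):
        self.n_arms = n_arms
        self.reward_model = LinearArmModel(n_arms, n_features)
        self.constraint_model = LinearArmModel(n_arms, n_features)
        self.weight = weight      # assumed knob: 0 = reward only, 1 = constraints only
        self.epsilon = epsilon    # assumed exploration rate
        self.rng = random.Random(seed)

    def observe_demonstration(self, x, arm, allowed):
        # Learn constraints by observation: was this arm acceptable in context x?
        self.constraint_model.update(arm, x, 1.0 if allowed else 0.0)

    def score(self, arm, x):
        return ((1 - self.weight) * self.reward_model.predict(arm, x)
                + self.weight * self.constraint_model.predict(arm, x))

    def select(self, x):
        # Explore with probability epsilon, otherwise pick the best mixed score.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_arms)
        return max(range(self.n_arms), key=lambda a: self.score(a, x))

    def feedback(self, x, arm, reward):
        # Stay reactive to online reward feedback.
        self.reward_model.update(arm, x, reward)

    def explain(self, arm, x):
        # Per-feature (reward, constraint) contributions to the arm's estimates.
        return [(r * xi, c * xi)
                for r, c, xi in zip(self.reward_model.w[arm],
                                    self.constraint_model.w[arm], x)]
```

In this sketch, raising `weight` makes the agent favor arms that the observed examples marked as acceptable, even when another arm has a higher estimated reward, while `explain` gives a rough view of which context features drive each estimate.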

Cite

Balakrishnan, A., Bouneffouf, D., Mattei, N., & Rossi, F. (2018). Using contextual bandits with behavioral constraints for constrained online movie recommendation. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2018-July, pp. 5802–5804). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2018/843
