Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval


Abstract

Interactive Information Retrieval (IIR) and Reinforcement Learning (RL) share many commonalities: an agent that learns while interacting, a long-term and complex goal, and an algorithm that explores and adapts. One challenge in applying RL methods to IIR is obtaining enough relevance labels to train the RL agents, which are notoriously sample-inefficient. Moreover, in a text corpus annotated for a given query, irrelevant documents far outnumber relevant ones. This yields highly unbalanced training experiences and prevents the agent from learning an effective policy. Our paper addresses this issue by using domain randomization to synthesize more relevant documents for training. Experimental results on the Text REtrieval Conference (TREC) Dynamic Domain (DD) 2017 Track show that the proposed method boosts an RL agent's learning effectiveness by 22% when dealing with unseen situations.
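The abstract does not spell out the synthesis procedure, so the sketch below is only a rough illustration of the general idea, not the authors' implementation. It assumes simple token-level perturbations (random drops and adjacent swaps) as the randomization scheme, and the function names (randomize_document, balance_training_pool) are hypothetical. The relevant class is oversampled with randomized variants until it roughly matches the irrelevant class in size.

```python
import random

def randomize_document(tokens, drop_prob=0.1, swap_prob=0.1, rng=None):
    """Create a synthetic variant of a relevant document by randomly
    dropping tokens and swapping adjacent tokens (an assumed, simple
    form of domain randomization for text)."""
    rng = rng or random.Random()
    # Randomly drop tokens.
    kept = [t for t in tokens if rng.random() > drop_prob]
    # Randomly swap adjacent tokens to vary word order.
    for i in range(len(kept) - 1):
        if rng.random() < swap_prob:
            kept[i], kept[i + 1] = kept[i + 1], kept[i]
    return kept

def balance_training_pool(relevant_docs, irrelevant_docs, rng=None):
    """Oversample the relevant class with synthesized variants until
    the two classes are roughly balanced, so the RL agent sees a less
    skewed mix of training experiences."""
    rng = rng or random.Random(0)
    synthetic = []
    while len(relevant_docs) + len(synthetic) < len(irrelevant_docs):
        source = rng.choice(relevant_docs)
        synthetic.append(randomize_document(source, rng=rng))
    return relevant_docs + synthetic, irrelevant_docs

# Example: 1 relevant vs. 3 irrelevant documents becomes a 3-vs-3 pool.
relevant = [["solar", "panel", "efficiency", "improves", "yearly"]]
irrelevant = [["stock", "market"], ["soccer", "results"], ["recipe", "ideas"]]
balanced_relevant, _ = balance_training_pool(relevant, irrelevant)
print(len(balanced_relevant))  # 3
```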

Citation (APA)

Chen, L., Tang, Z., & Yang, G. H. (2020). Balancing Reinforcement Learning Training Experiences in Interactive Information Retrieval. In SIGIR 2020 - Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 1525–1528). Association for Computing Machinery, Inc. https://doi.org/10.1145/3397271.3401200
