Abstract
In this work, we study reinforcement learning (RL) for solving text-based games. We address the challenge of the combinatorial action space by proposing a confidence-based self-imitation model that generates action candidates for the RL agent. First, we use self-imitation learning to rank and exploit valuable past trajectories, adapting a pre-trained language model (LM) to a target game. Then, we devise a confidence-based strategy that measures the LM's confidence with respect to a state and adaptively prunes the generated actions, yielding a more compact set of action candidates. On multiple challenging games, our model demonstrates promising performance compared with the baselines.
Cite
Shi, Z., Xu, Y., Fang, M., & Chen, L. (2023). Self-imitation Learning for Action Generation in Text-based Games. In EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings of the Conference (pp. 703–726). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.eacl-main.50