Hanabi is a multiplayer cooperative card game in which players can see their partners' cards but not their own. All players succeed or fail together, which makes the game an excellent testbed for studying collaboration. Recently, it has been shown that deep neural networks can be trained through self-play to play the game very well. However, such agents generally do not play well with others. In this paper, we investigate the consequences of training Rainbow DQN agents with human-inspired rule-based agents. We analyze which partners Rainbow agents learn to play well with, and how well their playing skill transfers to agents they were not trained with. We also analyze patterns of communication between agents to elucidate how collaboration happens. A key finding is that while most agents only learn to play well with partners seen during training, one particular agent leads the Rainbow algorithm towards a much more general policy. The metrics and hypotheses advanced in this paper can be used for further study of collaborative agents.
Canaan, R., Gao, X., Chung, Y., Togelius, J., Nealen, A., & Menzel, S. (2020). Behavioral evaluation of Hanabi Rainbow DQN agents and rule-based agents. In Proceedings of the 16th AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, AIIDE 2020 (pp. 31–37). The AAAI Press. https://doi.org/10.1609/aiide.v16i1.7404