Towards better interpretability in deep Q-networks

Raghuram Mandyam Annasamy; Katia Sycara

Conference Proceedings (Open Access)

33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (2019) 4561-4569

DOI: 10.1609/aaai.v33i01.33014561

56 Citations

76 Readers

Abstract

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical or empirical studies on understanding what these networks seem to learn are far behind. In this paper, we propose an interpretable neural network architecture for Q-learning that provides a global explanation of the model's behavior using key-value memories, attention, and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to those of state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.

Cite

APA

Annasamy, R. M., & Sycara, K. (2019). Towards better interpretability in deep Q-networks. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (pp. 4561–4569). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33014561
