VASE: Variational Assorted Surprise Exploration for Reinforcement Learning

Haitao Xu; Lech Szymanski; Brendan McCane

Journal Article

IEEE Transactions on Neural Networks and Learning Systems (2023) 34(3) 1243-1252

DOI: 10.1109/TNNLS.2021.3105140

Abstract

Exploration in environments with continuous control and sparse rewards remains a key challenge in reinforcement learning (RL). One of the approaches to encourage more systematic and efficient exploration relies on surprise as an intrinsic reward for the agent. We introduce a new definition of surprise and its RL implementation named variational assorted surprise exploration (VASE). VASE uses a Bayesian neural network as a model of the environment dynamics and is trained using variational inference, alternately updating the accuracy of the agent's model and policy. Our experiments show that in continuous control sparse reward environments, VASE outperforms other surprise-based exploration techniques.

Citation (APA)

Xu, H., Szymanski, L., & McCane, B. (2023). VASE: Variational Assorted Surprise Exploration for Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems, 34(3), 1243–1252. https://doi.org/10.1109/TNNLS.2021.3105140
