Reinforcement Learning with Stochastic Reward Machines

Jan Corazza; Ivan Gavran; Daniel Neider

Conference Proceedings, Open Access

Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (2022), Vol. 36, pp. 6429-6436

DOI: 10.1609/aaai.v36i6.20594

Abstract

Reward machines are an established tool for dealing with reinforcement learning problems in which rewards are sparse and depend on complex sequences of actions. However, existing algorithms for learning reward machines assume an overly idealized setting in which rewards are free of noise. To overcome this practical limitation, we introduce a novel type of reward machine, called a stochastic reward machine, and an algorithm for learning such machines. Our algorithm, based on constraint solving, learns minimal stochastic reward machines from the explorations of a reinforcement learning agent. This algorithm can easily be paired with existing reinforcement learning algorithms for reward machines and is guaranteed to converge to an optimal policy in the limit. We demonstrate the effectiveness of our algorithm in two case studies and show that it outperforms both existing methods and a naive approach for handling noisy reward functions.
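To make the central object more concrete, the following is a minimal sketch of what a reward machine with noisy outputs could look like. It is an illustration only, not the authors' formalism or learning algorithm: the class name `StochasticRewardMachine`, the labels `"key"` and `"door"`, the uniform noise range, and the convention that undefined transitions self-loop with zero reward are all assumptions made for this example.

```python
import random
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

State = int
Label = str
# A sampler draws one reward from the distribution attached to a transition.
Sampler = Callable[[random.Random], float]


def constant(value: float) -> Sampler:
    """Degenerate distribution: always returns the same reward."""
    return lambda rng: value


def uniform(lo: float, hi: float) -> Sampler:
    """Noisy reward drawn uniformly from [lo, hi] (assumed noise model)."""
    return lambda rng: rng.uniform(lo, hi)


@dataclass
class StochasticRewardMachine:
    """Finite-state machine whose transitions emit reward distributions."""
    initial: State
    # (current state, observed label) -> (next state, reward sampler)
    transitions: Dict[Tuple[State, Label], Tuple[State, Sampler]]
    state: State = field(init=False)

    def __post_init__(self) -> None:
        self.state = self.initial

    def reset(self) -> None:
        self.state = self.initial

    def step(self, label: Label, rng: random.Random) -> float:
        # Assumption: an undefined (state, label) pair self-loops with reward 0.
        next_state, sampler = self.transitions.get(
            (self.state, label), (self.state, constant(0.0))
        )
        self.state = next_state
        return sampler(rng)


def make_example() -> StochasticRewardMachine:
    # Hypothetical task: pick up a key, then open a door.
    # Only opening the door after holding the key pays a (noisy) reward.
    return StochasticRewardMachine(
        initial=0,
        transitions={
            (0, "key"): (1, constant(0.0)),
            (1, "door"): (2, uniform(0.9, 1.1)),
        },
    )


def run(labels: List[Label], seed: int = 0) -> List[float]:
    """Feed a label sequence through the example machine and collect rewards."""
    rng = random.Random(seed)
    machine = make_example()
    return [machine.step(label, rng) for label in labels]
```

For instance, `run(["door", "key", "door"])` yields zero reward for the first two labels and a value in [0.9, 1.1] for the last one: the reward depends on the order of events, and its exact value varies from run to run, which is the situation the abstract says noise-free reward machine learners cannot handle.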

Citation (APA)

Corazza, J., Gavran, I., & Neider, D. (2022). Reinforcement Learning with Stochastic Reward Machines. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022 (Vol. 36, pp. 6429–6436). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v36i6.20594
