Abstract
Reinforcement Learning (RL) deals with problems that can be modeled as a Markov Decision Process (MDP) whose transition function is unknown. In settings where an arbitrary policy π is already being executed and its experiences with the environment have been recorded in a batch D, an RL algorithm can use D to compute a new policy π′. However, the policy computed by traditional RL algorithms may perform worse than π. Our goal is to develop safe RL algorithms, where the agent has high confidence that the performance of π′ is better than that of π, given D. To obtain sample-efficient and safe RL algorithms, we combine ideas from exploration strategies in RL with a safe policy improvement method.
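The safe improvement test described above can be sketched in a minimal form: estimate the candidate policy's performance from the batch D via importance sampling, and accept π′ only if a one-sided lower confidence bound on that estimate exceeds the behavior policy's known performance. The function names, the dictionary-based policy representation, and the Gaussian-style bound are illustrative assumptions, not the paper's actual algorithm.

```python
import math

def is_return(episode, pi_b, pi_c, gamma=0.95):
    """Importance-sampled discounted return of one episode for the
    candidate policy pi_c, given data collected under pi_b.

    episode: list of (state, action, reward) tuples.
    pi_b, pi_c: dicts mapping (state, action) -> action probability.
    """
    weight, ret = 1.0, 0.0
    for t, (s, a, r) in enumerate(episode):
        weight *= pi_c[(s, a)] / pi_b[(s, a)]  # cumulative likelihood ratio
        ret += (gamma ** t) * r
    return weight * ret

def safe_improvement(batch, pi_b, pi_c, baseline, z=1.645):
    """Return (accept, lower_bound): accept pi_c only if a one-sided
    lower confidence bound on its estimated performance beats the
    behavior policy's baseline performance."""
    estimates = [is_return(ep, pi_b, pi_c) for ep in batch]
    n = len(estimates)
    mean = sum(estimates) / n
    var = sum((x - mean) ** 2 for x in estimates) / max(n - 1, 1)
    lower = mean - z * math.sqrt(var / n)  # Gaussian-style bound (illustrative)
    return lower > baseline, lower
```

In practice, plain importance sampling has high variance, which is exactly why sample efficiency matters for safe RL: tighter estimators and more informative batches make the lower bound less conservative.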
Citation
Simão, T. D. (2019). Safe and sample-efficient reinforcement learning algorithms for factored environments. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2019-August, pp. 6460–6461). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/919