Abstract
We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces. Our algorithm is based on a novel Q-learning policy with adaptive data-driven discretization. The central idea is to maintain a finer partition of the state-action space in regions that are frequently visited in historical trajectories and have higher payoff estimates. We demonstrate how our adaptive partitions take advantage of the shape of the optimal Q-function and of the joint space, without sacrificing worst-case performance. In particular, we recover the regret guarantees of prior algorithms for continuous state-action spaces, which additionally require an optimal discretization as input and/or access to a simulation oracle. Moreover, experiments demonstrate how our algorithm automatically adapts to the underlying structure of the problem, resulting in much better performance compared to both heuristics and Q-learning with uniform discretization.
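To make the central idea concrete, the following is a minimal, hypothetical sketch of data-driven partition refinement for Q-learning: a cell of the state-action space is split once it accumulates enough visits, so the partition becomes finer where trajectories concentrate. This is not the authors' algorithm; the class and function names (`Region`, `select_region`, `update_region`), the splitting threshold, and the exploration bonus are illustrative assumptions, not details taken from the paper.

```python
import math

class Region:
    """One cell of the (state, action) partition, here a square in 1-D state x 1-D action."""
    def __init__(self, center, radius):
        self.center = center        # midpoint of the cell as (state, action)
        self.radius = radius        # half-width of the cell
        self.visits = 0             # number of times this cell was selected
        self.q_estimate = 1.0       # optimistic initial Q-value estimate
        self.children = []          # sub-cells created when this cell is split

    def split_threshold(self):
        # Illustrative rule: split after roughly 1 / radius^2 visits, so that
        # frequently visited cells end up with a finer partition.
        return max(1, int(1.0 / (self.radius ** 2)))


def select_region(root, state):
    """Descend to the leaf cell covering `state` with the highest optimistic Q-estimate."""
    node = root
    while node.children:
        covering = [c for c in node.children if abs(c.center[0] - state) <= c.radius]
        node = max(covering, key=lambda c: c.q_estimate)
    return node


def update_region(region, target, learning_rate=None):
    """One Q-learning update on the selected cell, then refine the cell if warranted."""
    region.visits += 1
    lr = learning_rate if learning_rate is not None else 1.0 / region.visits
    bonus = 1.0 / math.sqrt(region.visits)   # exploration bonus (illustrative form)
    region.q_estimate += lr * (target + bonus - region.q_estimate)
    if region.visits >= region.split_threshold() and not region.children:
        s, a = region.center
        r = region.radius / 2.0
        # Split the cell into four quadrants; children inherit the parent's estimate.
        region.children = [Region((s + ds, a + da), r)
                           for ds in (-r, r) for da in (-r, r)]
        for child in region.children:
            child.q_estimate = region.q_estimate


# Example: a few updates on a root cell covering the square [-1, 1] x [-1, 1].
root = Region(center=(0.0, 0.0), radius=1.0)
for target in (0.3, 0.5, 0.7, 0.4):
    region = select_region(root, state=0.25)
    update_region(region, target)
```

In this sketch, cells near the queried state accumulate visits fastest and are the first to split, which mimics the paper's informal description of refining the partition where trajectories and payoff estimates concentrate.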
Citation
Sinclair, S. R., Banerjee, S., & Lee Yu, C. (2020). Adaptive Discretization for Episodic Reinforcement Learning in Metric Spaces. Performance Evaluation Review, 48(1), 17–18. https://doi.org/10.1145/3393691.3394176