When an agent has limited information about its environment, the suboptimality of an RL algorithm can be decomposed into the sum of two terms: a term due to asymptotic bias (the suboptimality that remains even with unlimited data) and a term due to overfitting (the additional suboptimality caused by limited data). In the context of reinforcement learning with partial observability, this paper analyzes the tradeoff between these two sources of error. In particular, our theoretical analysis formally characterizes how a smaller state representation increases the asymptotic bias while decreasing the risk of overfitting.
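As a sketch of that decomposition (the notation below is illustrative and not taken verbatim from the paper): let $\pi^*$ denote an optimal policy, $\pi_{\phi,\infty}$ the policy the algorithm would learn from unlimited data under a state representation $\phi$, and $\pi_{\phi,D}$ the policy learned from a finite dataset $D$. The expected suboptimality then splits telescopically as

$$
\mathbb{E}_{D}\!\left[V^{\pi^*}(s) - V^{\pi_{\phi,D}}(s)\right]
= \underbrace{V^{\pi^*}(s) - V^{\pi_{\phi,\infty}}(s)}_{\text{asymptotic bias}}
\;+\; \underbrace{\mathbb{E}_{D}\!\left[V^{\pi_{\phi,\infty}}(s) - V^{\pi_{\phi,D}}(s)\right]}_{\text{overfitting}},
$$

so a coarser representation $\phi$ tends to increase the first term while shrinking the second.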
Citation
François-Lavet, V., Rabusseau, G., Pineau, J., Ernst, D., & Fonteneau, R. (2020). On overfitting and asymptotic bias in batch reinforcement learning with partial observability. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 5055–5059). International Joint Conferences on Artificial Intelligence.