Sample complexity and performance bounds for non-parametric approximate linear programming

Abstract

One of the most difficult tasks in value function approximation for Markov Decision Processes is finding an approximation architecture that is expressive enough to capture the important structure in the value function, while at the same time not overfitting the training samples. Recent results in non-parametric approximate linear programming (NP-ALP) have demonstrated that this can be done effectively using nothing more than a smoothness assumption on the value function. In this paper we extend these results to the case where samples come from real-world transitions instead of the full Bellman equation, adding robustness to noise. In addition, we provide the first max-norm, finite-sample performance guarantees for any form of ALP. NP-ALP is amenable to problems with large (multidimensional) or even infinite (continuous) action spaces, and does not require a model to select actions using the resulting approximate solution. Copyright © 2013, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
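To make the smoothness assumption concrete, here is a minimal sketch of an NP-ALP-style linear program; the notation is ours for illustration and is simplified to state values, whereas the paper works with state-action values and a more refined formulation. Given n sampled transitions (s_i, a_i, r_i, s'_i), a discount factor gamma, an assumed Lipschitz constant L, and a metric d over states, one plausible LP over the sampled values v_1, ..., v_n is:

\begin{aligned}
\max_{v_1,\dots,v_n} \quad & \sum_{i=1}^{n} v_i \\
\text{s.t.} \quad & v_i \le r_i + \gamma \left( v_j + L\, d(s'_i, s_j) \right) && \forall\, i, j \quad \text{(sampled Bellman constraints)} \\
& v_i - v_j \le L\, d(s_i, s_j) && \forall\, i, j \quad \text{(Lipschitz smoothness)}
\end{aligned}

with the value at any unsampled state s recovered non-parametrically as \tilde{V}(s) = \min_j \left[ v_j + L\, d(s, s_j) \right]. Because \tilde{V} is a pointwise minimum of the affine terms v_j + L\, d(\cdot, s_j), the constraint v_i \le r_i + \gamma \tilde{V}(s'_i) decomposes into the family of linear constraints shown above, so the whole program remains a linear program with no parametric basis functions, which is what lets a smoothness assumption alone stand in for an approximation architecture.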

Citation (APA)

Pazis, J., & Parr, R. (2013). Sample complexity and performance bounds for non-parametric approximate linear programming. In Proceedings of the 27th AAAI Conference on Artificial Intelligence, AAAI 2013 (pp. 782–788). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/aaai.v27i1.8696
