Planning and learning in environments with delayed feedback

Thomas J. Walsh; Ali Nouri; Hong Li; Michael L. Littman

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 4701 LNAI, pp. 442–453 (2007)

DOI: 10.1007/978-3-540-74958-5_41

Abstract

This work considers the problems of planning and learning in environments with constant observation and reward delays. We provide a hardness result for the general planning problem and positive results for several special cases with deterministic or otherwise constrained dynamics. We present an algorithm, Model Based Simulation, for planning in such environments and use model-based reinforcement learning to extend this approach to the learning setting in both finite and continuous environments. Empirical comparisons show this algorithm holds significant advantages over others for decision making in delayed environments. © Springer-Verlag Berlin Heidelberg 2007.
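
To make the deterministic special case concrete, here is a minimal sketch, not taken from the paper, of how an agent with a known deterministic model and a constant observation delay of k steps could recover its current state: it replays the k actions it has issued since the delayed observation through the model, then acts with a policy for the undelayed problem. All names here (`predict_current_state`, `chain_step`, `chain_policy`) and the toy chain environment are illustrative assumptions, and the sketch should not be read as the authors' Model Based Simulation implementation.

```python
from collections import deque
from typing import Callable, Hashable, Iterable

State = Hashable
Action = Hashable


def predict_current_state(
    delayed_state: State,
    pending_actions: Iterable[Action],
    step: Callable[[State, Action], State],
) -> State:
    """Replay the actions issued since the delayed observation through the model."""
    state = delayed_state
    for action in pending_actions:
        state = step(state, action)
    return state


# Hypothetical toy model: a 1-D chain with positions 0..4 and actions -1, 0, +1.
def chain_step(state: int, action: int) -> int:
    return max(0, min(4, state + action))


# Hypothetical policy for the undelayed problem: walk to position 4, then stay.
def chain_policy(state: int) -> int:
    return 1 if state < 4 else 0


k = 2                                 # assumed constant observation delay
pending = deque([1, 1], maxlen=k)     # the k actions issued since the observation
observed = 1                          # state the agent sees, which is k steps old
current = predict_current_state(observed, pending, chain_step)
next_action = chain_policy(current)   # act on the predicted state, not the stale one
```

With stochastic dynamics the replay is no longer exact, which is where the hardness result and the constrained special cases mentioned in the abstract come in.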

Citation (APA)

Walsh, T. J., Nouri, A., Li, H., & Littman, M. L. (2007). Planning and learning in environments with delayed feedback. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4701 LNAI, pp. 442–453). Springer Verlag. https://doi.org/10.1007/978-3-540-74958-5_41
