Partially Observable Markov Decision Processes (POMDPs) have been applied with great success in planning domains where agents must balance actions that provide knowledge against actions that provide reward. Recently, nonparametric Bayesian methods have been successfully applied to POMDPs to obviate the need for a priori knowledge of the size of the state space, allowing the number of visited states to grow as the agent explores its environment. These approaches rely on the assumption that the agent's environment remains stationary; in real-world scenarios, however, the environment may change over time. In this work, we address this limitation by introducing a dynamic nonparametric Bayesian POMDP model that allows both for automatic inference of the (distributional) representations of POMDP states and for capturing non-stationarity in the modeled environments. Our method is formulated by imposing a suitable dynamic hierarchical Dirichlet process (dHDP) prior over state transitions. We derive efficient algorithms for model inference and action planning, and evaluate our approach on several benchmark tasks. © 2014 Springer International Publishing Switzerland.
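For intuition, a generic dHDP construction in the spirit of Ren et al.'s dynamic hierarchical Dirichlet process can be sketched as follows; the notation (weights \tilde{w}_t, innovation measures H_t, concentration parameters \alpha and \gamma) is illustrative and not necessarily the exact specification used in the paper:

G_t = (1 - \tilde{w}_t)\, G_{t-1} + \tilde{w}_t\, H_t, \qquad H_t \sim \mathrm{DP}(\alpha, G_0), \qquad G_0 \sim \mathrm{DP}(\gamma, H).

Under such a prior, a weight \tilde{w}_t near zero largely reuses the transition structure of the previous epoch, while a weight near one lets new states and new transition dynamics emerge, which is what allows the model to track a non-stationary environment.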
CITATION STYLE
Chatzis, S. P., & Kosmopoulos, D. (2014). A non-stationary infinite partially-observable markov decision process. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8681 LNCS, pp. 355–362). Springer Verlag. https://doi.org/10.1007/978-3-319-11179-7_45