In this paper we consider an optimal control problem for partially observable Markov decision processes with finite states, signals, and actions over an infinite horizon. It is shown that there are ε-optimal piecewise-linear value functions and piecewise-constant policies which are simple. Simple means that there are only finitely many pieces, each of which is defined on a convex polyhedral set. An algorithm based on the method of successive approximation is developed to compute an ε-optimal policy and the ε-optimal cost. Furthermore, a special class of stationary policies, called finitely transient, will be considered. It will be shown that such policies have attractive properties which enable us to convert a partially observable Markov decision chain into a usual finite-state Markov one.
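As a rough illustration of the piecewise-linear structure behind the successive-approximation scheme described in the abstract, the sketch below performs one exact value-iteration backup for a finite POMDP: each iterate of the value function is the maximum of finitely many linear functions ("alpha vectors") over the belief simplex. This is not the paper's algorithm, only a minimal sketch under illustrative assumptions; the names T, O, R, gamma and the reward-maximization form (the paper works with costs) are assumptions, and the cross-sum is left unpruned.

```python
# Minimal sketch (assumed setup, not the paper's algorithm) of one exact
# value-iteration backup for a finite POMDP. Each linear piece of the value
# function is an "alpha vector" over the state space.
from itertools import product
import numpy as np

def backup(alphas, T, O, R, gamma):
    """One successive-approximation step.

    alphas : list of (|S|,) arrays -- linear pieces of the current value function
    T[a]   : (|S|, |S|) transition matrix for action a
    O[a]   : (|S|, |Z|) signal matrix (next state -> signal) for action a
    R[a]   : (|S|,) immediate reward vector for action a
    Returns the (unpruned) alpha vectors of the next iterate.
    """
    n_obs = O[0].shape[1]
    new_alphas = []
    for a in range(len(T)):
        # g[z][i](s) = gamma * sum_{s'} T[a][s,s'] * O[a][s',z] * alphas[i][s']
        g = [[gamma * T[a] @ (O[a][:, z] * alpha) for alpha in alphas]
             for z in range(n_obs)]
        # Cross-sum over signals: choose one current piece per signal z.
        for choice in product(range(len(alphas)), repeat=n_obs):
            new_alphas.append(R[a] + sum(g[z][choice[z]] for z in range(n_obs)))
    return new_alphas

def value(belief, alphas):
    """Piecewise-linear value: maximum of finitely many linear functions of the belief."""
    return max(float(belief @ alpha) for alpha in alphas)
```

The unpruned cross-sum grows exponentially with the number of signals; practical implementations prune dominated vectors after each backup, which keeps each iterate simple in the sense used in the abstract (finitely many pieces, each linear on a convex polyhedral subset of the belief simplex).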
CITATION STYLE
Sawaki, K., & Ichikawa, A. (1978). Optimal control for partially observable Markov decision processes over an infinite horizon. Journal of the Operations Research Society of Japan, 21(1), 1–16. https://doi.org/10.15807/jorsj.21.1