OPTIMAL CONTROL FOR PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES OVER AN INFINITE HORIZON

  • Sawaki K
  • Ichikawa A

Abstract

In this paper we consider an optimal control problem for partially observable Markov decision processes with finite states, signals, and actions over an infinite horizon. It is shown that there are ε-optimal piecewise-linear value functions and piecewise-constant policies which are simple. Simple means that there are only finitely many pieces, each of which is defined on a convex polyhedral set. An algorithm based on the method of successive approximation is developed to compute an ε-optimal policy and ε-optimal cost. Furthermore, a special class of stationary policies, called finitely transient, is considered. It is shown that such policies have attractive properties which enable us to convert a partially observable Markov decision chain into a usual finite-state Markov one.
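The successive-approximation idea the abstract describes can be illustrated with a standard exact value-iteration backup over the belief simplex: the value function is represented by finitely many linear pieces ("alpha vectors"), and each backup produces the next piecewise-linear approximation. The sketch below is not the paper's construction; the 2-state machine-maintenance model, its numbers, and the naive pointwise-dominance pruning are all illustrative assumptions.

```python
# Minimal sketch of successive approximation for a POMDP, assuming a
# hypothetical 2-state model (state 0 = machine good, 1 = broken) with
# actions 0 = operate, 1 = repair. Not the paper's exact algorithm.
from itertools import product

def backup(alphas, T, O, R, gamma):
    """One exact value-iteration step: return the alpha vectors of V_{n+1}."""
    nS, nA, nO = len(R[0]), len(R), len(O[0][0])
    cands = set()
    for a in range(nA):
        # g[o][k][s] = sum_{s'} T[a][s][s'] * O[a][s'][o] * alphas[k][s']
        g = [[[sum(T[a][s][s2] * O[a][s2][o] * alp[s2] for s2 in range(nS))
               for s in range(nS)] for alp in alphas] for o in range(nO)]
        # one candidate vector per action and per choice of alpha per observation
        for ks in product(range(len(alphas)), repeat=nO):
            cands.add(tuple(R[a][s] + gamma * sum(g[o][ks[o]][s] for o in range(nO))
                            for s in range(nS)))
    # crude pruning: drop vectors pointwise dominated by another candidate
    return [v for v in cands
            if not any(w != v and all(w[s] >= v[s] for s in range(nS)) for w in cands)]

def V(b, alphas):
    """Piecewise-linear value: upper envelope of the linear pieces at belief b."""
    return max(sum(bs * a_s for bs, a_s in zip(b, alp)) for alp in alphas)

# Hypothetical model parameters (assumptions for illustration only).
T = [[[0.9, 0.1], [0.0, 1.0]],   # operate: good may break, broken stays broken
     [[1.0, 0.0], [1.0, 0.0]]]   # repair: always restores the good state
O = [[[0.8, 0.2], [0.3, 0.7]]] * 2  # noisy inspection signal, same for both actions
R = [[1.0, 0.0], [0.2, 0.2]]        # operate earns 1 while good; repair pays a flat 0.2
gamma, eps = 0.8, 0.01

alphas = [(0.0, 0.0)]                # V_0 = 0
beliefs = [(i / 10, 1 - i / 10) for i in range(11)]
for _ in range(100):
    nxt = backup(alphas, T, O, R, gamma)
    # stop once the change at sampled beliefs is small (a simple surrogate
    # for the paper's epsilon-optimality stopping criterion)
    done = max(abs(V(b, nxt) - V(b, alphas)) for b in beliefs) < eps
    alphas = nxt
    if done:
        break
```

After convergence, `alphas` holds the finitely many linear pieces of an approximately optimal value function, matching the abstract's point that the value function stays simple (finitely many pieces on convex polyhedral regions); the greedy action at a belief can be recovered from whichever piece attains the maximum.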

Citation (APA)
Sawaki, K., & Ichikawa, A. (1978). OPTIMAL CONTROL FOR PARTIALLY OBSERVABLE MARKOV DECISION PROCESSES OVER AN INFINITE HORIZON. Journal of the Operations Research Society of Japan, 21(1), 1–16. https://doi.org/10.15807/jorsj.21.1
