Polynomial time algorithms for branching Markov decision processes and probabilistic min(max) polynomial Bellman equations

Abstract

We show that one can approximate the least fixed point solution for a multivariate system of monotone probabilistic max (min) polynomial equations, in time polynomial in both the encoding size of the system of equations and in log(1/ε), where ε > 0 is the desired additive error bound of the solution. (The model of computation is the standard Turing machine model.) These equations form the Bellman optimality equations for several important classes of infinite-state Markov Decision Processes (MDPs). Thus, as a corollary, we obtain the first polynomial-time algorithms for computing to within arbitrary desired precision the optimal value vector for several classes of infinite-state MDPs which arise as extensions of classic, and heavily studied, purely stochastic processes. These include the problems of both maximizing and minimizing the termination (extinction) probability of multi-type branching MDPs, stochastic context-free MDPs, and 1-exit Recursive MDPs. We also show that we can compute in polynomial time an ε-optimal policy for any given desired ε > 0. © 2012 Springer-Verlag.
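
To make the objects in the abstract concrete, here is a minimal Python sketch of a max (min) probabilistic polynomial system and of its least fixed point. It is not the algorithm of the paper: it uses plain fixed-point (value) iteration from the zero vector, which converges to the least fixed point of a monotone system but can be very slow in the worst case and so carries no polynomial time guarantee. The encoding of monomials, the names eval_poly and kleene_iterate, the stopping rule, and the one-type example system are all assumptions made for illustration.

```python
from math import prod

# Hypothetical encoding (not from the paper):
#   system[i]  = list of actions available for type i
#   an action  = list of monomials (coeff, children), where children is a
#                tuple of type indices; coefficients of an action sum to <= 1
def eval_poly(monomials, x):
    """Evaluate one probabilistic polynomial at the vector x."""
    return sum(c * prod(x[i] for i in kids) for c, kids in monomials)

def kleene_iterate(system, choose, tol=1e-12, max_iters=100_000):
    """Iterate x <- P(x) from x = 0, where P_i(x) = choose over actions.

    choose is the built-in max or min. The stopping rule (small change
    between successive iterates) is a heuristic, not a proven error bound.
    """
    x = [0.0] * len(system)
    for _ in range(max_iters):
        nxt = [choose(eval_poly(act, x) for act in actions)
               for actions in system]
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return x

# One type, two actions: each individual either dies or has two children.
#   action a: x = 0.6 x^2 + 0.4    action b: x = 0.3 x^2 + 0.7
system = [[
    [(0.6, (0, 0)), (0.4, ())],
    [(0.3, (0, 0)), (0.7, ())],
]]

min_value = kleene_iterate(system, min)  # minimal extinction probability
max_value = kleene_iterate(system, max)  # maximal extinction probability
```

For this toy system the least fixed points can be checked by hand: the equation x = 0.6x² + 0.4 has roots 2/3 and 1, so the minimizing value is 2/3, while x = 0.3x² + 0.7 has roots 1 and 7/3, so the maximizing value is 1.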

Cite

Etessami, K., Stewart, A., & Yannakakis, M. (2012). Polynomial time algorithms for branching Markov decision processes and probabilistic min(max) polynomial Bellman equations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7391 LNCS, pp. 314–326). https://doi.org/10.1007/978-3-642-31594-7_27
