Markov decision processes with multiple objectives

Krishnendu Chatterjee; Rupak Majumdar; Thomas A. Henzinger

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3884 LNCS, pp. 325–336 (2006)

DOI: 10.1007/11672142_26

Abstract

We consider Markov decision processes (MDPs) with multiple discounted reward objectives. Such MDPs occur in design problems where one wishes to simultaneously optimize several criteria, for example, latency and power. The possible trade-offs between the different objectives are characterized by the Pareto curve. We show that every Pareto-optimal point can be achieved by a memoryless strategy; however, unlike in the single-objective case, the memoryless strategy may require randomization. Moreover, we show that the Pareto curve can be approximated in time polynomial in the size of the MDP. Additionally, we study the problem of whether a given value vector is realizable by any strategy, and show that it can be decided in polynomial time; the question of whether it is realizable by a deterministic memoryless strategy, however, is NP-complete. These results provide efficient algorithms for design exploration in MDP models with multiple objectives. © Springer-Verlag Berlin Heidelberg 2006.
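The claim that a memoryless strategy may need randomization can be illustrated with a minimal sketch. The toy MDP below is hypothetical and not taken from the paper: it has one state and two self-loop actions, each rewarding a different objective, and the value of a strategy is assumed to be the unnormalized discounted sum of rewards. All names (`BETA`, `REWARDS`, `TRANS`, `evaluate`) are illustrative assumptions.

```python
from itertools import product

# Hypothetical toy MDP (an assumption for illustration, not from the paper):
# one state, two self-loop actions, two objectives.
BETA = 0.5  # discount factor (assumed)
STATES = [0]
ACTIONS = ["a", "b"]
# REWARDS[s][a] is a reward vector with one entry per objective.
REWARDS = {0: {"a": (1.0, 0.0), "b": (0.0, 1.0)}}
# TRANS[s][a] maps successor state -> transition probability.
TRANS = {0: {"a": {0: 1.0}, "b": {0: 1.0}}}


def evaluate(policy, start=0, iters=200):
    """Discounted value vector of a memoryless policy from `start`.

    policy[s] maps action -> probability. The values are computed by
    fixed-point iteration v <- r_pi + BETA * P_pi v, one entry per objective.
    """
    k = len(REWARDS[STATES[0]][ACTIONS[0]])
    v = {s: [0.0] * k for s in STATES}
    for _ in range(iters):
        new_v = {}
        for s in STATES:
            vec = [0.0] * k
            for a, prob_a in policy[s].items():
                for i in range(k):
                    future = sum(p * v[t][i] for t, p in TRANS[s][a].items())
                    vec[i] += prob_a * (REWARDS[s][a][i] + BETA * future)
            new_v[s] = vec
        v = new_v
    return tuple(v[start])


# Value vectors of every deterministic memoryless policy.
deterministic = [
    evaluate({s: {a: 1.0} for s, a in zip(STATES, choice)})
    for choice in product(ACTIONS, repeat=len(STATES))
]

# A randomized memoryless policy that plays each action with probability 1/2.
mixed = evaluate({0: {"a": 0.5, "b": 0.5}})

print("deterministic:", deterministic)
print("randomized:   ", mixed)
```

In this toy instance the two deterministic memoryless policies each favor one objective entirely, while the randomized policy reaches a balanced trade-off that neither of them attains. Enumerating deterministic policies as done here takes time exponential in the number of states, so the sketch only illustrates the phenomenon and is not one of the algorithms the abstract refers to.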

Citation (APA)

Chatterjee, K., Majumdar, R., & Henzinger, T. A. (2006). Markov decision processes with multiple objectives. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3884 LNCS, pp. 325–336). https://doi.org/10.1007/11672142_26
