On minimizing ordered weighted regrets in multiobjective Markov decision processes

Wlodzimierz Ogryczak; Patrice Perny; Paul Weng

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 6992 LNAI, pp. 190–204 (2011)

DOI: 10.1007/978-3-642-24873-3_15

Abstract

In this paper, we propose an exact solution method to generate fair policies in Multiobjective Markov Decision Processes (MMDPs). MMDPs consider n immediate reward functions, representing either individual payoffs in a multiagent problem or rewards with respect to different objectives. In this context, we focus on the determination of a policy that fairly shares regrets among agents or objectives, the regret being defined on each dimension as the opportunity loss with respect to optimal expected rewards. To this end, we propose to minimize the ordered weighted average of regrets (OWR). The OWR criterion indeed extends the minimax regret, relaxing egalitarianism for a milder notion of fairness. After showing that OWR-optimality is state-dependent and that the Bellman principle does not hold for OWR-optimal policies, we propose a linear programming reformulation of the problem. We also provide experimental results showing the efficiency of our approach. © 2011 Springer-Verlag.
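As an illustration of the criterion named in the abstract, the following is a minimal sketch of how an ordered weighted regret could be evaluated for a fixed policy. It assumes the usual ordered-weighted-average construction: the per-objective regrets are sorted in decreasing order and combined with non-increasing weights, so that the largest regret counts most. The function names `regrets` and `owr`, the weight vectors, and the numeric values are all hypothetical and are not taken from the paper; the sketch only scores given value vectors and does not implement the linear programming reformulation the authors propose.

```python
from typing import List, Sequence


def regrets(optimal: Sequence[float], achieved: Sequence[float]) -> List[float]:
    """Per-objective regret: opportunity loss w.r.t. the optimal expected reward."""
    if len(optimal) != len(achieved):
        raise ValueError("optimal and achieved must have the same length")
    return [o - a for o, a in zip(optimal, achieved)]


def owr(regret: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted regret (assumed definition).

    Regrets are sorted from largest to smallest, then paired with the
    weights in the order given. Non-increasing weights put the most
    emphasis on the worst-off objective.
    """
    if len(regret) != len(weights):
        raise ValueError("regret and weights must have the same length")
    ordered = sorted(regret, reverse=True)
    return sum(w * r for w, r in zip(weights, ordered))


# Hypothetical example: three objectives, each with optimal expected reward 10.
optimal_values = [10.0, 10.0, 10.0]
policy_a = [10.0, 10.0, 4.0]   # excellent on two objectives, poor on one
policy_b = [8.0, 8.0, 8.0]     # moderately good on all objectives

fair_weights = [0.6, 0.3, 0.1]      # non-increasing weights (illustrative)
minimax_weights = [1.0, 0.0, 0.0]   # all weight on the largest regret

owr_a = owr(regrets(optimal_values, policy_a), fair_weights)
owr_b = owr(regrets(optimal_values, policy_b), fair_weights)
minimax_a = owr(regrets(optimal_values, policy_a), minimax_weights)
minimax_b = owr(regrets(optimal_values, policy_b), minimax_weights)
```

With these illustrative numbers, the balanced policy `policy_b` gets the lower score under both weight vectors. Putting all weight on the first position reduces the score to the maximum regret, which is one way to read the abstract's statement that OWR extends minimax regret.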

Citation (APA)

Ogryczak, W., Perny, P., & Weng, P. (2011). On minimizing ordered weighted regrets in multiobjective Markov decision processes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6992 LNAI, pp. 190–204). https://doi.org/10.1007/978-3-642-24873-3_15
