Markov Decision Processes with a New Optimality Criterion: Discrete Time

  • Jaquette S

Abstract

Standard finite state and action discrete time Markov decision processes with discounting are studied using a new optimality criterion called moment optimality. A policy is moment optimal if it lexicographically maximizes the sequence of signed moments of total discounted return with a positive (negative) sign if the moment is odd (even). This criterion is equivalent to being a little risk averse. It is shown that a stationary policy is moment optimal by examining the negative of the Laplace transform of the total return random variable. An algorithm to construct all stationary moment optimal policies is developed. The algorithm is shown to be finite.
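The lexicographic comparison of signed moments described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it compares only the first few moments, whereas the criterion concerns the whole sequence, and it takes the return distribution of each policy as given rather than deriving it from a Markov decision process. The names `signed_moments` and `moment_preferred`, and the two return distributions `safe` and `risky`, are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def signed_moments(outcomes, n_moments):
    """Return [+E[R], -E[R^2], +E[R^3], ...] for a finite distribution.

    `outcomes` is a list of (value, probability) pairs; this finite
    representation of the total return is an assumption of the sketch.
    """
    moments = []
    for k in range(1, n_moments + 1):
        m = sum(Fraction(p) * Fraction(v) ** k for v, p in outcomes)
        sign = 1 if k % 2 == 1 else -1  # odd moments positive, even negative
        moments.append(sign * m)
    return moments


def moment_preferred(a, b, n_moments=4):
    """True if `a` beats `b` lexicographically on the first n signed moments."""
    # Python compares lists lexicographically, element by element.
    return signed_moments(a, n_moments) > signed_moments(b, n_moments)


# Two hypothetical return distributions with the same mean (10).
safe = [(10, Fraction(1))]
risky = [(0, Fraction(1, 2)), (20, Fraction(1, 2))]

print(moment_preferred(safe, risky))
```

Both distributions have the same first moment, so the comparison falls to the second signed moment, which favors the distribution with the smaller second moment. This is one way to read the abstract's remark that the criterion is slightly risk averse.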

Cite

Jaquette, S. C. (1973). Markov Decision Processes with a New Optimality Criterion: Discrete Time. The Annals of Statistics, 1(3). https://doi.org/10.1214/aos/1176342415
