Markov Decision Processes with Fuzzy Risk-Sensitive Rewards: The Best Coherent Risk Measures Under Risk Averse Utilities

Abstract

Risk-sensitive Markov decision processes with risk constraints are discussed using the best coherent risk measures under risk-averse utilities. The coherent risk measures are represented as weighted average value-at-risks whose risk spectrum is the one best adapted to the decision maker's risk-averse utility, so that the spectrum, used as a weighting, inherits the risk-averse property of that utility. Risk-sensitive expected rewards are also approximated by the derived weighted average value-at-risks. By perception-based extension, the Markov decision processes are formulated for fuzzy random variables. First, to find feasible ranges of risk levels, a risk-minimizing problem is discussed by mathematical programming. Next, the maximization of risk-sensitive running rewards under the feasible risk constraints is discussed by dynamic programming. For the maximization of risk-sensitive terminal rewards in the long term, a sufficient condition for numerical computation of the solutions is given. A few numerical examples help explain the obtained results, and several figures illustrate the details.
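To make the phrase "weighted average value-at-risks" concrete, the following is a minimal sketch and not the chapter's own construction. It assumes the usual lower-tail convention for rewards: the value-at-risk at level q is the q-quantile of the reward, the average value-at-risk at level p averages those quantiles over (0, p], and a weighted average value-at-risk averages them over (0, 1) with a nonincreasing risk spectrum. The function names and the exponential spectrum are hypothetical choices for illustration; the abstract does not give the spectrum that the chapter derives from the utility.

```python
import math


def quantile(sorted_xs, p):
    """Empirical p-quantile (linear interpolation) of sorted reward samples."""
    pos = p * (len(sorted_xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (pos - lo) * (sorted_xs[hi] - sorted_xs[lo])


def average_value_at_risk(samples, p, grid=1000):
    """Average of VaR_q over 0 < q <= p, approximated by a midpoint rule."""
    xs = sorted(samples)
    qs = [(i + 0.5) / grid * p for i in range(grid)]
    return sum(quantile(xs, q) for q in qs) / grid


def weighted_average_var(samples, spectrum, grid=1000):
    """Spectrum-weighted average of VaR_q over 0 < q < 1.

    The weights are renormalized so they sum to 1 on the grid.
    """
    xs = sorted(samples)
    qs = [(i + 0.5) / grid for i in range(grid)]
    ws = [spectrum(q) for q in qs]
    total = sum(ws)
    return sum(w * quantile(xs, q) for w, q in zip(ws, qs)) / total


def exponential_spectrum(c):
    """Hypothetical nonincreasing spectrum; larger c stresses bad outcomes more."""
    return lambda q: math.exp(-c * q)
```

Under these assumptions, a constant spectrum approximately recovers the plain mean of the rewards, while a steeper nonincreasing spectrum puts more weight on low quantiles and therefore yields a smaller, more conservative value.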

Cite

Yoshida, Y. (2021). Markov Decision Processes with Fuzzy Risk-Sensitive Rewards: The Best Coherent Risk Measures Under Risk Averse Utilities. In Studies in Computational Intelligence (Vol. 922, pp. 135–161). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-70594-7_6
