This chapter discusses a reduction of discounted continuous-time Markov decision processes (CTMDPs) to discrete-time Markov decision processes (MDPs). The reduction rests on the equivalence between a randomized policy that chooses actions only at jump epochs and a nonrandomized policy that may switch actions between jumps. For discounted CTMDPs with bounded jump rates, this reduction was introduced by the author in 2004 as a reduction to discounted MDPs. Here we show that the reduction also holds for unbounded jump and reward rates, although the corresponding MDP may no longer be discounted. Nevertheless, the analysis of the equivalent total-reward MDP yields a description of optimal policies for the CTMDP and provides methods for their computation.
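For the bounded-rate case mentioned above, the reduction follows the standard uniformization pattern. The sketch below uses notation not fixed by the abstract: transition rates q(y|x,a), jump rates q(x,a) bounded by Λ, reward rate r(x,a), and discount rate α > 0 are all assumed symbols.

```latex
% Sketch of the bounded-rate reduction (assumed notation: q(y|x,a) are
% transition rates, q(x,a) = \sum_{y \ne x} q(y|x,a) \le \Lambda < \infty
% are jump rates, r(x,a) is the reward rate, \alpha > 0 the discount rate).
\[
  \beta = \frac{\Lambda}{\alpha + \Lambda}, \qquad
  p(y \mid x, a) =
    \begin{cases}
      q(y \mid x, a)/\Lambda, & y \ne x,\\[2pt]
      1 - q(x, a)/\Lambda,    & y = x,
    \end{cases}
  \qquad
  \tilde{r}(x, a) = \frac{r(x, a)}{\alpha + \Lambda}.
\]
% The discounted CTMDP is value-equivalent to the discrete-time MDP with
% transition probabilities p, one-step rewards \tilde{r}, and discount
% factor \beta < 1.
```

When the jump rates are unbounded, no finite Λ exists, the discount factor cannot be bounded away from 1, and the equivalent discrete-time model becomes a total-reward MDP rather than a discounted one, consistent with the abstract's statement.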
Feinberg, E. A. (2012). Reduction of discounted continuous-time MDPs with unbounded jump and reward rates to discrete-time total-reward MDPs. In Systems and Control: Foundations and Applications (pp. 77–97). Birkhäuser. https://doi.org/10.1007/978-0-8176-8337-5_5