Sample-path optimality in average Markov decision chains under a double Lyapunov function condition


Abstract

This work concerns discrete-time average Markov decision chains on a denumerable state space. Besides standard continuity-compactness requirements, the main structural condition on the model is that the cost function has a Lyapunov function ℓ and that a power larger than two of ℓ also admits a Lyapunov function. In this context, the existence of optimal stationary policies in the (strong) sample-path sense is established, and it is shown that the Markov policies obtained from methods commonly used to approximate a solution of the optimality equation are also sample-path average optimal.
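
For orientation, the objects named in the abstract can be written out as below. This is a sketch in notation commonly used for this class of models, not a quotation from the chapter. The symbols S (state space), A(x) (admissible actions at x), C (cost function), p_{xy}(a) (transition law), z (a fixed reference state), ℓ_1, β, g and h are assumptions introduced here for illustration, and the chapter's exact statements may differ.

```latex
% Assumed form of a Lyapunov function \ell for the cost function C,
% relative to a fixed state z:
\[
  \ell : S \to [1,\infty), \qquad
  1 + |C(x,a)| + \sum_{y \neq z} p_{xy}(a)\,\ell(y) \le \ell(x),
  \quad x \in S,\ a \in A(x).
\]

% Assumed reading of the "double" condition: for some \beta > 2 the
% function \ell^{\beta} also admits a Lyapunov function \ell_1:
\[
  1 + \ell(x)^{\beta} + \sum_{y \neq z} p_{xy}(a)\,\ell_1(y) \le \ell_1(x),
  \quad x \in S,\ a \in A(x).
\]

% Average cost optimality equation with optimal average cost g
% and relative value function h:
\[
  g + h(x) = \min_{a \in A(x)}
  \Bigl[\, C(x,a) + \sum_{y \in S} p_{xy}(a)\,h(y) \Bigr],
  \quad x \in S.
\]

% Sample-path average optimality of a policy \pi^{*}: for every initial
% state x and every policy \pi,
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{t=0}^{n-1} C(X_t, A_t) = g
  \quad P_x^{\pi^{*}}\text{-a.s.},
  \qquad
  \liminf_{n \to \infty} \frac{1}{n} \sum_{t=0}^{n-1} C(X_t, A_t) \ge g
  \quad P_x^{\pi}\text{-a.s.}
\]
```

Under this reading, optimality is required along almost every trajectory rather than only in expectation, which is why the abstract calls it the strong sample-path sense.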

Cite

Cavazos-Cadena, R., & Montes-de-Oca, R. (2012). Sample-path optimality in average Markov decision chains under a double Lyapunov function condition. In Systems and Control: Foundations and Applications (pp. 31–57). Birkhäuser. https://doi.org/10.1007/978-0-8176-8337-5_3
