This work concerns discrete-time average Markov decision chains on a denumerable state space. Besides standard continuity-compactness requirements, the main structural condition on the model is that the cost function has a Lyapunov function ℓ and that some power of ℓ larger than two also admits a Lyapunov function. In this context, the existence of optimal stationary policies in the (strong) sample-path sense is established, and it is shown that the Markov policies obtained from methods commonly used to approximate a solution of the optimality equation are also sample-path average optimal.
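For orientation, the objects named in the abstract can be written out as follows. This is only a sketch in the notation commonly used for such models, not the chapter's own statement: the symbols S (state space), A(x) (admissible actions), C (cost function), p_{xy}(a) (transition law), z (a fixed reference state), g (optimal average cost), h (relative value function) and β (the exponent) are assumptions introduced here, and the chapter's definitions may include further continuity and recurrence requirements.

```latex
% Assumed notation: state space S, action sets A(x), cost C(x,a),
% transition law p_{xy}(a), fixed reference state z \in S.

% A common form of the Lyapunov inequality for the cost function C,
% with \ell : S \to [1,\infty):
\[
  1 + |C(x,a)| + \sum_{y \neq z} p_{xy}(a)\,\ell(y) \;\le\; \ell(x),
  \qquad x \in S,\ a \in A(x).
\]

% "Double" condition: for some exponent \beta > 2, the function
% \ell^{\beta} (in the role of a cost) also admits a Lyapunov function.

% Average-cost optimality equation for a pair (g, h):
\[
  g + h(x) \;=\; \inf_{a \in A(x)}
  \Bigl[\, C(x,a) + \sum_{y \in S} p_{xy}(a)\, h(y) \Bigr],
  \qquad x \in S.
\]

% Sample-path average optimality of a policy \pi^{*} (strong sense):
% for every initial state x and every policy \pi,
\[
  \lim_{n\to\infty} \frac{1}{n} \sum_{t=0}^{n-1} C(X_t, A_t) = g
  \quad P_x^{\pi^{*}}\text{-a.s.},
  \qquad
  \liminf_{n\to\infty} \frac{1}{n} \sum_{t=0}^{n-1} C(X_t, A_t) \ge g
  \quad P_x^{\pi}\text{-a.s.}
\]
```

In this reading, the sample-path criterion compares the realized average costs along almost every trajectory rather than their expectations, which is why it is stronger than the usual expected average cost criterion.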
CITATION
Cavazos-Cadena, R., & Montes-de-Oca, R. (2012). Sample-path optimality in average Markov decision chains under a double Lyapunov function condition. In Systems and Control: Foundations and Applications (pp. 31–57). Birkhäuser. https://doi.org/10.1007/978-0-8176-8337-5_3