In a zero-sum limiting average stochastic game, we evaluate a strategy π for the maximizing player, player 1, by the reward φ_s(π) that π guarantees him when play starts in state s. A strategy π is called non-improving if φ_s(π) ≥ φ_s(π[h]) for every state s and every finite history h, where π[h] is the strategy π conditional on the history h; otherwise the strategy is called improving. We investigate the use of improving and non-improving strategies, and explore the relation between (non-)improvingness and (ε-)optimality. Improving strategies appear to play a very important role in obtaining ε-optimality, while 0-optimal strategies are always non-improving. Several examples clarify these issues.
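The definitions above can be written out in symbols. This is only a sketch: the abstract does not say how the guaranteed reward of π from state s is computed, so the first formula assumes a common limiting average form (worst case over the opponent's strategies, with the liminf taken inside the expectation). The symbols σ, r, s_n, a_n, b_n, and N are illustrative and do not come from the source. The second and third formulas restate the condition given in the abstract.

```latex
% Assumed form of the reward that \pi guarantees from state s
% (\sigma: strategy of player 2, r: stage payoff; both hypothetical names).
\varphi_s(\pi) \;=\; \inf_{\sigma}\;
  \mathbb{E}_{s,\pi,\sigma}\!\left[\,\liminf_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N} r(s_n,a_n,b_n)\right]

% Non-improving, as stated in the abstract:
\varphi_s(\pi) \;\ge\; \varphi_s(\pi[h])
  \qquad \text{for every state } s \text{ and every finite history } h

% Improving is the negation of the condition above:
\exists\, s,\ \exists\, h \ \text{ such that } \ \varphi_s(\pi[h]) \;>\; \varphi_s(\pi)
```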
Flesch, J., Thuijsman, F., & Vrieze, O. J. (1998). Improving strategies in stochastic games. Proceedings of the IEEE Conference on Decision and Control, 3, 2674–2679. https://doi.org/10.1109/cdc.1998.757857