We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria of a stochastic game remain unchanged after potential-based shaping is applied to the environment. This policy-invariance property provides a possible way of speeding up convergence when learning to play a stochastic game. © 2011 AI Access Foundation. All rights reserved.
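The transformation referred to in the abstract is potential-based reward shaping, in which the shaped reward is r' = r + γΦ(s') − Φ(s) for some potential function Φ over states. A minimal sketch of that transformation follows; the discount factor, the 1-D chain of states, and the distance-based potential `phi` are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of potential-based reward shaping (assumed setup:
# a 1-D chain of states 0..4 with the goal at state 4).

GAMMA = 0.9  # discount factor (assumed)

def phi(state):
    """Assumed potential function: negative distance to the goal state 4."""
    return -abs(4 - state)

def shaped_reward(reward, state, next_state, gamma=GAMMA):
    """Return r + gamma * phi(s') - phi(s), the potential-based shaping
    transformation whose key property is that it leaves optimal policies
    (and, per this paper, Nash equilibria) unchanged."""
    return reward + gamma * phi(next_state) - phi(state)

# Example: moving one step toward the goal (state 2 -> 3) with base reward 0
# yields a positive shaping bonus: 0 + 0.9 * (-1) - (-2) = 1.1.
print(shaped_reward(0.0, 2, 3))
```

Because the shaping term telescopes along any trajectory, it changes returns by a policy-independent constant per start state, which is why equilibria are preserved.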
Lu, X., Schwartz, H. M., & Givigi, S. N. (2011). Research note: Policy invariance under reward transformations for general-sum stochastic games. Journal of Artificial Intelligence Research, 41, 397–406. https://doi.org/10.1613/jair.3384