Convergent gradient ascent in general-sum games

Abstract

In this work we examine recent results on policy gradient learning in general-sum games, embodied in two algorithms: IGA and WoLF-IGA. We address the drawbacks in the convergence properties of these algorithms and propose a more accurate version of WoLF-IGA that is guaranteed to converge to Nash equilibrium policies in self-play (or against an IGA learner). We also present a control-theoretic interpretation of the variable learning rate, which not only justifies WoLF-IGA but also shows that it achieves the fastest convergence under certain constraints. Finally, we derive optimal learning rates for the fastest convergence in practical simulations.
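
To make the variable learning rate idea concrete, below is a minimal Python sketch of the original WoLF-IGA update in self-play on a two-action game (not the paper's refined variant). The payoff matrices, the equilibrium strategies a_eq/b_eq used in the win/lose test, and the rates l_min/l_max are illustrative assumptions; with small finite steps the loop only approximates the infinitesimal dynamics analyzed in the paper.

```python
import numpy as np

# Illustrative payoff matrices (matching pennies): R[i, j] is the row
# player's payoff, C[i, j] the column player's. The update itself is
# written for general-sum games.
R = np.array([[1.0, -1.0], [-1.0, 1.0]])
C = -R

def expected_payoff(M, a, b):
    """Expected payoff when the row player picks action 0 with
    probability a and the column player with probability b."""
    p = np.array([a, 1 - a])
    q = np.array([b, 1 - b])
    return p @ M @ q

def gradient_row(a, b):
    """d/da of the row player's expected payoff (linear in a,
    so the gradient depends only on b)."""
    u = R[0, 0] - R[0, 1] - R[1, 0] + R[1, 1]
    return b * u + (R[0, 1] - R[1, 1])

def gradient_col(a, b):
    """d/db of the column player's expected payoff."""
    u = C[0, 0] - C[0, 1] - C[1, 0] + C[1, 1]
    return a * u + (C[1, 0] - C[1, 1])

def wolf_iga(a, b, a_eq=0.5, b_eq=0.5, l_min=0.01, l_max=0.04, steps=20000):
    """WoLF-IGA in self-play: each player ascends its payoff gradient,
    learning slowly (l_min) when 'winning' -- doing better than its
    equilibrium strategy would against the opponent's current play --
    and fast (l_max) when losing."""
    for _ in range(steps):
        lr_row = l_min if expected_payoff(R, a, b) > expected_payoff(R, a_eq, b) else l_max
        lr_col = l_min if expected_payoff(C, a, b) > expected_payoff(C, a, b_eq) else l_max
        a = np.clip(a + lr_row * gradient_row(a, b), 0.0, 1.0)
        b = np.clip(b + lr_col * gradient_col(a, b), 0.0, 1.0)
    return a, b

print(wolf_iga(0.9, 0.2))  # spirals in toward the mixed equilibrium (0.5, 0.5)
```

With a fixed learning rate (plain IGA) the strategy pair orbits the mixed equilibrium in this game; switching between l_min and l_max based on the win/lose test is what damps the orbit and yields convergence.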

Citation (APA)

Banerjee, B., & Peng, J. (2002). Convergent gradient ascent in general-sum games. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2430, pp. 1–9). Springer Verlag. https://doi.org/10.1007/3-540-36755-1_1
