Solving games with functional regret estimation

Kevin Waugh; Dustin Morrill; J. Andrew Bagnell; Michael Bowling

Conference Proceedings (Open Access)

AAAI Workshop - Technical Report (2015) WS-15-07 63-69

DOI: 10.1609/aaai.v29i1.9445

Abstract

We propose a novel online learning method for minimizing regret in large extensive-form games. The approach learns a function approximator online to estimate the regret for choosing a particular action. A no-regret algorithm uses these estimates in place of the true regrets to define a sequence of policies. We prove the approach sound by providing a bound relating the quality of the function approximation to the regret of the algorithm. A corollary is that the method is guaranteed to converge to a Nash equilibrium in self-play so long as the regrets are ultimately realizable by the function approximator. Our technique can be understood as a principled generalization of existing work on abstraction in large games; in our work, both the abstraction and the equilibrium are learned during self-play. We demonstrate empirically that the method achieves higher quality strategies than state-of-the-art abstraction techniques given the same resources.
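The idea of substituting estimated regrets for true regrets can be illustrated with a minimal sketch. It assumes regret matching as the no-regret rule and a linear model as the function approximator; the abstract names neither, so both are illustrative choices. The names `regret_matching_policy`, `policy_from_estimator`, `linear_estimator`, and the feature vectors are hypothetical.

```python
from typing import Callable, List, Sequence


def regret_matching_policy(estimated_regrets: Sequence[float]) -> List[float]:
    """Play each action in proportion to its positive estimated regret."""
    positive = [max(r, 0.0) for r in estimated_regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive estimated regret: fall back to uniform play.
        n = len(estimated_regrets)
        return [1.0 / n] * n
    return [p / total for p in positive]


def policy_from_estimator(
    estimator: Callable[[Sequence[float]], float],
    features_per_action: Sequence[Sequence[float]],
) -> List[float]:
    """Query the approximator once per action, then apply regret matching."""
    return regret_matching_policy([estimator(f) for f in features_per_action])


# Hypothetical linear regret estimator; in an online setting these weights
# would be refit as new regret observations arrive during self-play.
weights = [0.5, -1.0]


def linear_estimator(features: Sequence[float]) -> float:
    return sum(w * x for w, x in zip(weights, features))


# Hypothetical feature vectors, one per available action.
features = [[2.0, 0.0], [0.0, 1.0], [6.0, 0.0]]
policy = policy_from_estimator(linear_estimator, features)
print(policy)  # [0.25, 0.0, 0.75]
```

Here the estimated regrets are 1.0, -1.0, and 3.0, so the second action is never played and the other two are weighted 1:3. Because actions with similar features receive similar estimates, the approximator plays the role that a fixed abstraction would otherwise play.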

Citation (APA)

Waugh, K., Morrill, D., Bagnell, J. A., & Bowling, M. (2015). Solving games with functional regret estimation. In AAAI Workshop - Technical Report (Vol. WS-15-07, pp. 63–69). AI Access Foundation. https://doi.org/10.1609/aaai.v29i1.9445
