Dynamic thresholding and pruning for regret minimization

Citations: 36 · Mendeley readers: 55

Abstract

Regret minimization is widely used in determining strategies for imperfect-information games and in online learning. In large games, computing the regrets associated with a single iteration can be slow. For this reason, pruning, in which parts of the decision tree are not traversed in every iteration, has emerged as an essential method for speeding up iterations in large games. The ability to prune is a primary reason why the Counterfactual Regret Minimization (CFR) algorithm using regret matching has emerged as the most popular iterative algorithm for imperfect-information games, despite its relatively poor convergence bound. In this paper, we introduce dynamic thresholding, in which a threshold is set at every iteration such that any action in the decision tree with probability below the threshold is set to zero probability. This enables pruning for the first time in a wide range of algorithms. We prove that dynamic thresholding can be applied to Hedge while increasing its convergence bound by only a constant factor in terms of the number of iterations. Experiments demonstrate a substantial improvement in performance for Hedge as well as for the excessive gap technique.
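The core idea in the abstract, zeroing out low-probability actions each iteration so the corresponding parts of the tree can be skipped, can be sketched for the Hedge (multiplicative-weights) update. This is a minimal illustration, not the paper's algorithm: the threshold schedule `1 / (n * sqrt(t))` and the step size `eta` are assumptions chosen for the sketch, and the paper's actual constants and pruning machinery for game trees are omitted.

```python
import numpy as np

def hedge_with_dynamic_thresholding(loss_fn, n_actions, T, eta=0.5):
    """Hedge with a per-iteration dynamic threshold (illustrative sketch).

    Actions whose probability falls below the threshold are set to zero
    probability; in a game-tree setting this is what permits pruning the
    subtrees those actions lead to. The schedule 1/(n*sqrt(t)) is an
    assumed example, not the constant proven in the paper.
    """
    cum_loss = np.zeros(n_actions)
    p = np.full(n_actions, 1.0 / n_actions)
    for t in range(1, T + 1):
        # Standard Hedge distribution (shifted by the min loss for stability).
        w = np.exp(-eta * (cum_loss - cum_loss.min()))
        p = w / w.sum()

        # Dynamic thresholding: zero low-probability actions, renormalize.
        # The max-probability action is always >= 1/n, so p.sum() stays > 0.
        threshold = 1.0 / (n_actions * np.sqrt(t))
        p[p < threshold] = 0.0
        p /= p.sum()

        # Full-information loss update (losses assumed in [0, 1]).
        cum_loss += loss_fn(t)
    return p
```

For example, with a fixed loss vector where action 0 is always best, the returned distribution concentrates on action 0 and the dominated actions are explicitly zeroed rather than merely small, which is the property that makes pruning possible.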

Citation (APA)

Brown, N., Kroer, C., & Sandholm, T. (2017). Dynamic thresholding and pruning for regret minimization. In 31st AAAI Conference on Artificial Intelligence, AAAI 2017 (pp. 421–429). AAAI press. https://doi.org/10.1609/aaai.v31i1.10603
