Approximating optimal Dudo play with fixed-strategy iteration counterfactual regret minimization

Todd W. Neller; Steven Hnath

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 7168 LNCS, pp. 170–183 (2012)

DOI: 10.1007/978-3-642-31866-5_15

Abstract

Using the bluffing dice game Dudo as a challenge domain, we abstract information sets by an imperfect recall of actions. Even with such abstraction, the standard Counterfactual Regret Minimization (CFR) algorithm proves impractical for Dudo, since the number of recursive visits to the same abstracted information sets increases exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we show how CFR training iterations may be transformed from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time. © 2012 Springer-Verlag.
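
The visit-count argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' algorithm and contains no regret updates. It assumes a hypothetical claim graph in which node i means "the last claim made was claim i" and any stronger claim j > i may follow. The function names `recursive_visits` and `single_pass_visits` are made up for this illustration. Naive recursion revisits a node once per path that reaches it, while a single pass in topological order touches each node once and accumulates per-node path totals as it goes.

```python
def recursive_visits(n_claims: int) -> int:
    """Count node visits made by naive recursion over the toy claim graph."""
    visits = 0

    def visit(i: int) -> None:
        nonlocal visits
        visits += 1  # the same node is counted again on every path reaching it
        for j in range(i + 1, n_claims):  # any stronger claim may follow
            visit(j)

    visit(0)
    return visits


def single_pass_visits(n_claims: int) -> tuple[int, list[int]]:
    """Process each node once, in topological (increasing claim) order.

    Returns the number of node visits and, for each node, the number of
    distinct paths from node 0 that reach it.
    """
    paths_reaching = [0] * n_claims
    paths_reaching[0] = 1
    visits = 0
    for i in range(n_claims):
        visits += 1  # each node is handled exactly once
        for j in range(i + 1, n_claims):
            # Push this node's accumulated total forward to its successors.
            paths_reaching[j] += paths_reaching[i]
    return visits, paths_reaching


if __name__ == "__main__":
    n = 10
    dp_visits, paths = single_pass_visits(n)
    print("recursive visits:", recursive_visits(n))
    print("single-pass visits:", dp_visits)
    print("paths summed over nodes:", sum(paths))
```

In this toy graph the recursive count doubles with every added claim, whereas the single pass grows linearly in nodes (quadratically in edges). The per-node path totals sum to the recursive visit count, which suggests why quantities that recursion computes path by path can instead be accumulated per node once the strategy is held fixed for the iteration.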

Citation (APA)

Neller, T. W., & Hnath, S. (2012). Approximating optimal Dudo play with fixed-strategy iteration counterfactual regret minimization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7168 LNCS, pp. 170–183). https://doi.org/10.1007/978-3-642-31866-5_15
