Approximating optimal Dudo play with fixed-strategy iteration counterfactual regret minimization

Abstract

Using the bluffing dice game Dudo as a challenge domain, we abstract information sets by an imperfect recall of actions. Even with such abstraction, the standard Counterfactual Regret Minimization (CFR) algorithm proves impractical for Dudo, since the number of recursive visits to the same abstracted information sets increases exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we show how CFR training iterations may be transformed from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time. © 2012 Springer-Verlag.
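
The gap between exponential recursion and a polynomial dynamic-programming pass can be illustrated with a toy model. The sketch below is not the authors' algorithm and not the real Dudo game. It assumes a hypothetical chain of strictly increasing claims, numbered 1 to `num_claims`, in which the abstracted node depends only on the most recent claim, and it ignores dice rolls and the challenge action. The names `naive_visits` and `dp_path_counts` are illustrative. The first function counts how often plain recursion re-enters each node; the second obtains the same counts by processing each node once in topological order.

```python
from typing import List


def naive_visits(num_claims: int) -> List[int]:
    """Count how often plain recursion enters each claim node.

    Index 0 is the hypothetical "no claim yet" start node.
    """
    visits = [0] * (num_claims + 1)

    def recurse(claim: int) -> None:
        visits[claim] += 1
        # Any strictly higher claim is a legal successor in this toy model.
        for higher in range(claim + 1, num_claims + 1):
            recurse(higher)

    recurse(0)
    return visits


def dp_path_counts(num_claims: int) -> List[int]:
    """Compute the same counts with one forward pass over the nodes.

    Each node is processed exactly once, so the work is O(num_claims ** 2)
    edge updates instead of one recursive call per path.
    """
    paths = [0] * (num_claims + 1)
    paths[0] = 1
    for claim in range(num_claims + 1):
        for higher in range(claim + 1, num_claims + 1):
            paths[higher] += paths[claim]
    return paths


if __name__ == "__main__":
    print(naive_visits(4))    # visits per node under plain recursion
    print(dp_path_counts(4))  # identical counts from a single pass
```

In this toy model the recursion enters the highest claim node 2^(num_claims - 1) times, while the forward pass touches it once per incoming edge. Under the assumption that the strategy does not change during an iteration, the same single pass could carry accumulated quantities other than path counts, which is the general idea the abstract describes.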

Cite

Neller, T. W., & Hnath, S. (2012). Approximating optimal Dudo play with fixed-strategy iteration counterfactual regret minimization. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7168 LNCS, pp. 170–183). https://doi.org/10.1007/978-3-642-31866-5_15
