Sharing information in adversarial bandit

David L. St-Pierre; Olivier Teytaud

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8602 386-398

DOI: 10.1007/978-3-662-45523-4_32

Abstract

Two-player games provide a popular platform for research in Artificial Intelligence (AI). One of the main challenges arising from this platform is approximating a Nash Equilibrium (NE) of zero-sum matrix games. Although computing such a Nash Equilibrium is solvable in polynomial time using Linear Programming (LP), it rapidly becomes infeasible as the size of the matrix grows, a situation commonly encountered in games. This paper focuses on improving the approximation of a NE for matrix games so that it outperforms state-of-the-art algorithms given a finite (and rather small) number T of oracle requests to rewards. To reach this objective, we propose to share information between the different relevant pure strategies. We show, both theoretically by improving the bound and empirically by experiments on artificial matrices and on a real-world game, that information sharing improves the approximation of the NE.
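To make the setting concrete, the following is a minimal sketch of approximating a NE of a zero-sum matrix game under a budget of T oracle requests. It uses standard EXP3 in self-play as a stand-in baseline; it is not the information-sharing method of the paper, whose details are not given in this abstract. The function name `exp3_self_play`, the exploration rate `gamma`, and the assumption that rewards lie in [0, 1] are illustrative choices, not taken from the source.

```python
import math
import random


def exp3_self_play(matrix, horizon, gamma=0.1, seed=0):
    """Approximate a NE of a zero-sum matrix game with EXP3 self-play.

    Assumption: matrix[i][j] is the row player's reward in [0, 1] and the
    column player receives 1 - matrix[i][j]. Each round uses exactly one
    oracle request, so `horizon` plays the role of the budget T.
    Returns the empirical play frequencies of both players.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Cumulative importance-weighted reward estimates, kept in log-space.
    row_scores = [0.0] * n_rows
    col_scores = [0.0] * n_cols
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols

    def distribution(scores):
        k = len(scores)
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Mix the exponential weights with uniform exploration.
        return [(1 - gamma) * w / total + gamma / k for w in weights]

    for _ in range(horizon):
        p = distribution(row_scores)
        q = distribution(col_scores)
        i = rng.choices(range(n_rows), weights=p)[0]
        j = rng.choices(range(n_cols), weights=q)[0]
        reward = matrix[i][j]  # the single oracle request of this round
        # Only the played arm is updated, using reward / probability.
        row_scores[i] += gamma / n_rows * reward / p[i]
        col_scores[j] += gamma / n_cols * (1 - reward) / q[j]
        row_counts[i] += 1
        col_counts[j] += 1

    return ([c / horizon for c in row_counts],
            [c / horizon for c in col_counts])
```

For example, `exp3_self_play([[0.9, 0.8], [0.1, 0.2]], 2000)` returns two frequency vectors; because the first row dominates the second, the row player's mass should concentrate on the first pure strategy. Note that each arm here learns only from its own pulls, which is the baseline behaviour that sharing information between pure strategies is meant to improve on.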

Citation (APA)

St-Pierre, D. L., & Teytaud, O. (2014). Sharing information in adversarial bandit. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8602, pp. 386–398). Springer Verlag. https://doi.org/10.1007/978-3-662-45523-4_32
