Using a priori information for fast learning against non-stationary opponents

Pablo Hernandez-Leal; Enrique Munoz De Cote; L. Enrique Sucar

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8864 536-547

DOI: 10.1007/978-3-319-12027-0_43

Abstract

For an agent to succeed when interacting with many different and unknown types of opponents, it should excel at quickly learning a model of the opponent and adapting online to non-stationary (changing) strategies. Recent works have tackled this problem by continuously learning models of the opponent while checking for switches in the opponent's strategy. However, these approaches fail to use a priori information, which can be useful for faster detection of the opponent model. Moreover, if an opponent uses only a finite set of strategies, then maintaining a list of those strategies would also provide benefits for future interactions with opponents who return to previous strategies (such as periodic opponents). Our contribution is twofold. First, we propose an algorithm that can use a priori information, in the form of a set of models, to promote faster detection of the opponent model. Second, we propose an algorithm that, while learning new models, keeps a record of them in case the opponent reuses one of them. Our approach outperforms state-of-the-art algorithms in the field (in terms of model quality and cumulative rewards) in the domain of the iterated prisoner’s dilemma against a non-stationary opponent that switches among different strategies.
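
To make the idea of detecting an opponent from a set of prior models concrete, the following is a minimal illustrative sketch in Python. It is not the authors' algorithm. It assumes, purely for illustration, that each prior model is a deterministic iterated prisoner's dilemma strategy depending only on the previous round, and that detection means picking the model with the fewest prediction errors on the observed history. The names `PRIOR_MODELS`, `mismatch_rate`, `detect_model`, and the `threshold` parameter are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

C, D = "C", "D"  # cooperate, defect

# One round of play: (our action, opponent's action).
Round = Tuple[str, str]
# Assumed model form: maps the previous round (None at the start)
# to the opponent's next action.
Model = Callable[[Optional[Round]], str]


def tit_for_tat(prev: Optional[Round]) -> str:
    # Cooperate first, then copy our previous action.
    return C if prev is None else prev[0]


def pavlov(prev: Optional[Round]) -> str:
    # Cooperate first, then cooperate only if both players
    # chose the same action in the previous round.
    if prev is None:
        return C
    return C if prev[0] == prev[1] else D


def always_defect(prev: Optional[Round]) -> str:
    return D


# Hypothetical a priori set of opponent models.
PRIOR_MODELS: Dict[str, Model] = {
    "tit_for_tat": tit_for_tat,
    "pavlov": pavlov,
    "always_defect": always_defect,
}


def mismatch_rate(model: Model, history: List[Round]) -> float:
    """Fraction of rounds where the model mispredicts the opponent."""
    if not history:
        return 0.0
    errors = 0
    prev: Optional[Round] = None
    for ours, theirs in history:
        if model(prev) != theirs:
            errors += 1
        prev = (ours, theirs)
    return errors / len(history)


def detect_model(
    history: List[Round],
    models: Dict[str, Model],
    threshold: float = 0.1,  # assumed tolerance, not from the paper
) -> Optional[str]:
    """Name of the best-matching prior model, or None if none fits."""
    if not models:
        return None
    best = min(models, key=lambda name: mismatch_rate(models[name], history))
    return best if mismatch_rate(models[best], history) <= threshold else None
```

For example, `detect_model([("C", "C"), ("D", "C"), ("D", "D"), ("C", "D")], PRIOR_MODELS)` selects `"tit_for_tat"`. When the function returns `None`, an agent in this sketch would fall back to learning a new model from scratch and could then add it to the dictionary, which mirrors the abstract's second idea of keeping a record of learned models for later reuse.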

Cite

Hernandez-Leal, P., De Cote, E. M., & Sucar, L. E. (2014). Using a priori information for fast learning against non-stationary opponents. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8864, 536–547. https://doi.org/10.1007/978-3-319-12027-0_43
