On learning soccer strategies

Abstract

We use simulated soccer to study multiagent learning. Each team's players (agents) share an action set and a policy, but may behave differently due to position-dependent inputs. All agents making up a team are rewarded or punished collectively when goals are scored. We conduct simulations with varying team sizes and compare two learning algorithms: TD-Q learning with linear neural networks (TD-Q) and Probabilistic Incremental Program Evolution (PIPE). TD-Q is based on evaluation functions (EFs) mapping input/action pairs to expected reward, while PIPE searches policy space directly: it uses an adaptive probability distribution to synthesize programs that compute action probabilities from the current inputs. Our results show that TD-Q has difficulty learning appropriate shared EFs. PIPE, which does not depend on EFs, finds good policies faster and more reliably.
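
For concreteness, here is a minimal sketch of the TD-Q side of this comparison: Q-learning with a linear evaluation function (EF) that maps input/action pairs to expected reward. The feature and action counts, learning constants, and epsilon-greedy exploration below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Assumed sizes and constants, for illustration only.
N_FEATURES, N_ACTIONS = 8, 4
ALPHA, GAMMA, EPSILON = 0.01, 0.95, 0.1

rng = np.random.default_rng(0)
# One weight vector per action: EF(s, a) = w[a] . s, the expected reward.
w = np.zeros((N_ACTIONS, N_FEATURES))

def q_values(state):
    """Linear EF: one estimated return per action for the given input."""
    return w @ state

def select_action(state):
    """Epsilon-greedy choice over the shared EF (assumed exploration scheme)."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(state)))

def td_q_update(state, action, reward, next_state):
    """One TD-Q step: move EF(s, a) toward r + gamma * max_a' EF(s', a')."""
    target = reward + GAMMA * np.max(q_values(next_state))
    td_error = target - q_values(state)[action]
    w[action] += ALPHA * td_error * state  # gradient step on the linear EF
```

Because all of a team's players share these weights and the collective reward arrives only at goals, every update pushes one shared EF toward many position-dependent experiences at once, which is the difficulty the abstract reports. PIPE sidesteps EFs entirely: it maintains a probability distribution over programs, samples candidate policies from it, and shifts the distribution toward programs whose teams perform well.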

Citation (APA)

Sałustowicz, R., Wiering, M., & Schmidhuber, J. (1997). On learning soccer strategies. In Lecture Notes in Computer Science (Vol. 1327, pp. 769–774). Springer. https://doi.org/10.1007/bfb0020247
