Spike-based Decision Learning of Nash Equilibria in Two-Player Games


Abstract

Humans and animals face decision tasks in an uncertain multi-agent environment where an agent's strategy may change in time due to the co-adaptation of others' strategies. The neuronal substrate and the computational algorithms underlying such adaptive decision making, however, are largely unknown. We propose a population coding model of spiking neurons with a policy gradient procedure that successfully acquires optimal strategies for classical game-theoretical tasks. The suggested population reinforcement learning reproduces data from human behavioral experiments for the blackjack and the inspector game. It performs optimally according to a pure (deterministic) and mixed (stochastic) Nash equilibrium, respectively. In contrast, temporal-difference (TD) learning, covariance learning, and basic reinforcement learning fail to perform optimally for the stochastic strategy. Spike-based population reinforcement learning, shown to follow the stochastic reward gradient, is therefore a viable candidate to explain automated decision learning of a Nash equilibrium in two-player games. © 2012 Friedrich, Senn.
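The abstract's key point is that only a stochastic policy can be optimal in games such as the inspector game, whose sole Nash equilibrium is mixed, and that policies trained by following the reward gradient can settle on such mixed strategies. As a minimal illustration (not the authors' spiking population model), the sketch below applies a plain REINFORCE-style policy gradient to matching pennies, a standard two-player game that, like the inspector game, has only a mixed equilibrium; both players should spend roughly half their time on each action. The game choice, learning rate, and episode count are illustrative assumptions.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def run(T=20000, lr=0.02, seed=0):
    """Self-play policy-gradient learning on matching pennies.

    Matching pennies stands in for the inspector game: its only Nash
    equilibrium is mixed (play each action with probability 1/2), so any
    deterministic policy is exploitable. Player 1 earns +1 if the actions
    agree and -1 otherwise; player 2 earns the opposite (zero-sum).
    Returns player 1's empirical frequency of action 1.
    """
    rng = random.Random(seed)
    theta1, theta2 = 0.0, 0.0   # logits for P(action = 1)
    count1 = 0
    for _ in range(T):
        p1, p2 = sigmoid(theta1), sigmoid(theta2)
        a1 = 1 if rng.random() < p1 else 0
        a2 = 1 if rng.random() < p2 else 0
        r1 = 1.0 if a1 == a2 else -1.0
        r2 = -r1
        # REINFORCE update: for a Bernoulli policy,
        # d/d(theta) log pi(a) = a - sigmoid(theta)
        theta1 += lr * r1 * (a1 - p1)
        theta2 += lr * r2 * (a2 - p2)
        count1 += a1
    return count1 / T
```

Because gradient play in zero-sum games tends to cycle around the mixed equilibrium rather than converge pointwise, the instantaneous policy oscillates, but the time-averaged play hovers near the equilibrium frequency of 1/2; this is a toy sketch of the gradient-following behavior the paper attributes to spike-based population reinforcement learning, not a reproduction of its results.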


CITATION STYLE

APA

Friedrich, J., & Senn, W. (2012). Spike-based Decision Learning of Nash Equilibria in Two-Player Games. PLoS Computational Biology, 8(9). https://doi.org/10.1371/journal.pcbi.1002691
