A spiking network model of decision making employing rewarded STDP

Abstract

Reward-modulated spike-timing-dependent plasticity (STDP) combines unsupervised STDP with a reinforcement signal that modulates synaptic changes. It was proposed as a learning rule capable of solving the distal reward problem in reinforcement learning. Nonetheless, the performance and limitations of this learning mechanism have yet to be tested on biological problems. In our work, rewarded STDP was implemented to model foraging behavior in a simulated environment. Over the course of training, the network of spiking neurons developed the capability of producing highly successful decision making. The network performance remained stable even after significant perturbations of synaptic structure. Rewarded STDP alone was insufficient to learn effective decision making because of the difficulty of maintaining homeostatic equilibrium of synaptic weights and the development of local performance maxima. Our study predicts that successful learning requires stabilizing mechanisms that allow neurons to balance their input and output synapses, as well as synaptic noise. © 2014 Skorheim et al.
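To make the learning rule concrete, the following is a minimal sketch of reward-modulated STDP for a single synapse. It is not the authors' implementation: the function name `rewarded_stdp`, the eligibility-trace formulation, the 1 ms time step, and all parameter values are assumptions chosen for illustration. Spike pairings are stored in a decaying eligibility trace, and the weight changes only when a (possibly delayed) reward arrives, which is one common way to address the distal reward problem. The stabilizing mechanisms the abstract calls for (input/output balancing, synaptic noise) are deliberately left out.

```python
import math

def rewarded_stdp(pre_spikes, post_spikes, rewards, n_steps,
                  a_plus=1.0, a_minus=1.0, tau_stdp=20.0,
                  tau_elig=200.0, lr=0.01, w0=0.5):
    """Return the final weight of one synapse after n_steps steps of 1 ms.

    pre_spikes / post_spikes: iterables of spike times (integer steps).
    rewards: dict mapping a time step to a reward value (hypothetical format).
    All parameter values are illustrative assumptions.
    """
    pre_spikes, post_spikes = set(pre_spikes), set(post_spikes)
    decay_stdp = math.exp(-1.0 / tau_stdp)   # per-step decay of spike traces
    decay_elig = math.exp(-1.0 / tau_elig)   # per-step decay of eligibility
    pre_trace = post_trace = elig = 0.0
    w = w0
    for t in range(n_steps):
        pre_trace *= decay_stdp
        post_trace *= decay_stdp
        elig *= decay_elig
        pre, post = t in pre_spikes, t in post_spikes
        if pre:
            # Post-before-pre pairing: tag the synapse for depression.
            elig -= a_minus * post_trace
        if post:
            # Pre-before-post pairing: tag the synapse for potentiation.
            elig += a_plus * pre_trace
        if pre:
            pre_trace += 1.0
        if post:
            post_trace += 1.0
        # The weight moves only when a reward signal is present.
        w += lr * rewards.get(t, 0.0) * elig
        w = min(max(w, 0.0), 1.0)            # keep the weight in [0, 1]
    return w

# Pre fires 5 ms before post; a positive reward arrives 85 ms later.
w_rewarded = rewarded_stdp([10], [15], {100: 1.0}, n_steps=200)
w_punished = rewarded_stdp([10], [15], {100: -1.0}, n_steps=200)
w_no_reward = rewarded_stdp([10], [15], {}, n_steps=200)
```

In this sketch a causal pairing followed by a positive reward strengthens the synapse, the same pairing followed by a negative reward weakens it, and without any reward the weight stays at its initial value. Hard clipping to [0, 1] is only a crude stand-in for weight regulation and should not be read as the homeostatic balancing described in the abstract.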

Cite

Skorheim, S., Lonjers, P., & Bazhenov, M. (2014). A spiking network model of decision making employing rewarded STDP. PLoS ONE, 9(3). https://doi.org/10.1371/journal.pone.0090821
