Back-propagation as reinforcement in prediction tasks

Abstract

The back-propagation (BP) training scheme is widely used for training network models in cognitive science despite its well-known technical and biological shortcomings. In this paper we contribute to making the BP training scheme more acceptable from a biological point of view in cognitively motivated prediction tasks by overcoming one of its major drawbacks. Traditionally, recurrent neural networks in symbolic time-series prediction (e.g. language) are trained with gradient-descent-based learning algorithms, notably with back-propagation through time. A major drawback for the biological plausibility of BP is that it is a supervised scheme in which a teacher has to provide a fully specified target answer. Yet agents in natural environments often receive only a summary feedback about the degree of success or failure, a view adopted in reinforcement learning schemes. In this work we show that for simple recurrent networks in prediction tasks for which there is a probability interpretation of the network's output vector, Elman BP can be reimplemented as a reinforcement learning scheme for which the expected weight updates agree with those of traditional Elman BP, using ideas from the AGREL learning scheme (van Ooyen and Roelfsema 2003) for feed-forward networks. © Springer-Verlag Berlin Heidelberg 2005.
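The core equivalence the abstract describes can be illustrated numerically. The sketch below is not the paper's code; it is a minimal, hypothetical single-output-layer setup (names, sizes, and the fixed hidden state are all illustrative assumptions) showing the AGREL-style trick: sample a prediction from the network's output distribution, deliver a binary reward only when the sample matches the actual next symbol, and scale the score-function update by the inverse sampling probability, so that the expected update equals the supervised cross-entropy gradient used in Elman BP.

```python
# Hypothetical sketch (not the authors' code): expected RL update
# matches the supervised cross-entropy gradient at a softmax output.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n_symbols, n_hidden = 5, 8
W = rng.normal(scale=0.1, size=(n_symbols, n_hidden))  # output weights
h = rng.normal(size=n_hidden)                          # a fixed hidden state
target = 2                                             # actual next symbol

p = softmax(W @ h)  # network's predictive distribution over next symbols

# Supervised Elman BP at the output layer: gradient of -log p[target].
delta_sup = p.copy()
delta_sup[target] -= 1.0
grad_sup = np.outer(delta_sup, h)

# Reinforcement reimplementation: sample a prediction from p, get a
# summary reward (1 iff the sample equals the actual next symbol),
# and scale the score-function update by 1/p[s].
def rl_update():
    s = rng.choice(n_symbols, p=p)
    r = 1.0 if s == target else 0.0
    delta = p.copy()
    delta[s] -= 1.0                  # gradient of -log p[s]
    return (r / p[s]) * np.outer(delta, h)

grad_rl = np.mean([rl_update() for _ in range(200_000)], axis=0)
print(np.abs(grad_rl - grad_sup).max())  # ~0: expected updates agree
```

Averaging over samples s ~ p, the rewarded updates contribute p[target] · (1/p[target]) · ∇(-log p[target]), which is exactly the supervised gradient, so only a scalar success/failure signal is needed rather than a fully specified target vector.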

Citation (APA)

Grüning, A. (2005). Back-propagation as reinforcement in prediction tasks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3697 LNCS, pp. 547–552). https://doi.org/10.1007/11550907_86
