Dynamics of the Bush-Mosteller Learning Algorithm in 2x2 Games

  • R. L
  • S. S

Abstract

Reinforcement learners interact with their environment and use their experience to choose or avoid certain actions based on the observed consequences. Actions that led to satisfactory outcomes (i.e. outcomes that met or exceeded aspirations) in the past tend to be repeated in the future, whereas choices that led to unsatisfactory experiences are avoided.

The empirical study of reinforcement learning dates back to Thorndike's animal experiments on instrumental learning at the end of the 19th century (Thorndike, 1898). The results of these experiments were formalised in the well-known 'Law of Effect', which is nowadays one of the most robust properties of learning in the experimental psychology literature:

"Of several responses made to the same situation those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections to the situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond." (Thorndike, 1911, p. 244)

Nowadays there is little doubt that reinforcement learning is an important aspect of much learning in most animal species, including many phylogenetically very distant from vertebrates (e.g. earthworms (Maier & Schneirla, 1964) and fruit flies (Wustmann, 1996)). Thus, it is not surprising that reinforcement learning, being one of the most widespread adaptation mechanisms in nature, has attracted the attention of many scientists and engineers for decades. This interest has led to the formulation of various models of reinforcement learning and, when feasible, to the theoretical analysis of their dynamics.
In particular, this chapter characterises the dynamics of one of the best known stochastic models of reinforcement learning (Bush & Mosteller, 1955) when applied to decision problems of strategy (i.e. games).
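The aspiration-based learning rule described above can be sketched in code. The snippet below assumes the common formulation of the Bush-Mosteller model in which the stimulus is the difference between the received payoff and an aspiration level, normalised to [-1, 1]; the Prisoner's Dilemma payoffs, parameter values, and function names are illustrative assumptions, not details taken from the chapter.

```python
import random

def bm_update(p, payoff, aspiration, learning_rate, max_abs_diff):
    # Stimulus in [-1, 1]: positive iff the outcome met or exceeded the aspiration.
    stimulus = (payoff - aspiration) / max_abs_diff
    if stimulus >= 0:
        return p + learning_rate * stimulus * (1 - p)  # satisfactory: reinforce
    return p + learning_rate * stimulus * p            # unsatisfactory: inhibit

# Illustrative Prisoner's Dilemma payoffs for the row player:
# (own action, opponent action) -> payoff
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 4, ('D', 'D'): 1}

def play(rounds=1000, aspiration=2.0, learning_rate=0.2, seed=1):
    """Two Bush-Mosteller learners repeatedly playing a 2x2 game."""
    random.seed(seed)
    p = [0.5, 0.5]  # p[i] = probability that player i cooperates
    max_abs_diff = max(abs(v - aspiration) for v in PAYOFF.values())
    for _ in range(rounds):
        a = ['C' if random.random() < pi else 'D' for pi in p]
        payoffs = (PAYOFF[(a[0], a[1])], PAYOFF[(a[1], a[0])])
        for i in range(2):
            # Update the probability of the action actually chosen,
            # then convert back to the probability of cooperating.
            q = p[i] if a[i] == 'C' else 1 - p[i]
            q = bm_update(q, payoffs[i], aspiration, learning_rate, max_abs_diff)
            p[i] = q if a[i] == 'C' else 1 - q
    return p
```

Because each update moves a probability by at most a fraction of its distance to 0 or 1, the mixed strategies always remain valid probabilities; the chapter's analysis concerns where these stochastic trajectories tend to settle.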

Citation (APA)

R., L., & S., S. (2008). Dynamics of the Bush-Mosteller Learning Algorithm in 2x2 Games. In Reinforcement Learning. I-Tech Education and Publishing. https://doi.org/10.5772/5282
