Instrumental vigour in punishment and reward

  • Peter Dayan

Recent notions about the vigour of responding in operant conditioning suggest that the long-run average rate of reward should control the alacrity of action in cases in which the actual cost of speed is balanced against the opportunity cost of sloth. The average reward rate is suggested as being reported by tonic activity in the dopamine system and thereby influencing all actions, including ones that do not themselves lead directly to the rewards. This idea is syntactically problematical for the case of punishment. Here, we broaden the scope of the original suggestion, providing a two-factor analysis of obviated punishment in a variety of operant circumstances. We also consider the effects of stochastically successful actions, which turn out to differ rather markedly between appetitive and aversive cases. Finally, we study how to fit these ideas into nascent treatments that extend concepts of opponency between dopamine and serotonin from valence to invigoration.
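
The balance between the cost of speed and the opportunity cost of sloth can be made concrete with a short worked minimisation. This is a sketch only: the hyperbolic cost form $C_v/\tau$ and the symbols $\tau$ (response latency), $C_v$ (a vigour cost constant) and $\bar{R}$ (long-run average reward rate) are assumptions introduced for illustration and do not appear in the abstract.

```latex
% Assumed total cost of responding with latency \tau > 0:
%   C_v / \tau   : actual cost of speed (faster is costlier)
%   \bar{R}\,\tau : opportunity cost of sloth (reward forgone while waiting)
\begin{align}
  K(\tau) &= \frac{C_v}{\tau} + \bar{R}\,\tau, \\
  \frac{dK}{d\tau} &= -\frac{C_v}{\tau^{2}} + \bar{R} = 0
  \quad\Longrightarrow\quad
  \tau^{*} = \sqrt{\frac{C_v}{\bar{R}}} .
\end{align}
```

Under these assumptions a larger $\bar{R}$ shortens the optimal latency $\tau^{*}$. For $\bar{R} \le 0$, as can arise when outcomes are punishments, the expression has no positive real solution, which is one way to read the difficulty the abstract raises for the aversive case.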

Author-supplied keywords

  • Dopamine
  • Reinforcement learning
  • Safety
  • Serotonin
  • Two-factor theory
