We used genetic algorithms to evolve populations of reinforcement learning (Q-learning) agents that played a repeated two-player symmetric coordination game (a stag hunt) under different risk conditions. Evolution steered the simulated populations to the Pareto-inefficient equilibrium under high-risk conditions and to the Pareto-efficient equilibrium under low-risk conditions. Greater forgiveness and stronger temporal discounting of future returns emerged in populations playing the low-risk game. The results demonstrate the utility of simulation methods for evolutionary psychology.
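The abstract does not reproduce the simulation details, so the following is a minimal sketch of the kind of setup it describes: two tabular Q-learning agents playing a repeated stag hunt whose risk level is varied through the miscoordination payoff. The payoff values, learning parameters, and state representation (each agent conditions on its partner's previous move) are illustrative assumptions, not Bearden's actual design, and the genetic-algorithm layer that evolves agent parameters across generations is omitted for brevity.

```python
import random

STAG, HARE = 0, 1

def payoff_matrix(risk="high"):
    """Illustrative stag-hunt payoffs; not the values used in the paper.

    Risk is varied via the payoff a stag hunter receives when the
    partner hunts hare: low in the high-risk game, closer to the safe
    hare payoff in the low-risk game.
    """
    miscoord = 0 if risk == "high" else 3
    return {
        (STAG, STAG): 5,   # Pareto-efficient equilibrium
        (STAG, HARE): miscoord,
        (HARE, STAG): 4,
        (HARE, HARE): 4,   # Pareto-inefficient (safe) equilibrium
    }

class QAgent:
    """Tabular Q-learner whose state is the partner's previous action."""
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Q[state][action]; state None marks the first round of a match.
        self.q = {s: [0.0, 0.0] for s in (None, STAG, HARE)}

    def act(self, state):
        if random.random() < self.epsilon:
            return random.choice((STAG, HARE))
        return max((STAG, HARE), key=lambda a: self.q[state][a])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update; gamma is the temporal discount factor.
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (td_target - self.q[state][action])

def play_repeated_game(a1, a2, rounds=200, risk="high"):
    pay = payoff_matrix(risk)
    s1 = s2 = None  # each agent's state is the partner's last move
    total1 = total2 = 0
    for _ in range(rounds):
        m1, m2 = a1.act(s1), a2.act(s2)
        r1, r2 = pay[(m1, m2)], pay[(m2, m1)]
        a1.update(s1, m1, r1, m2)
        a2.update(s2, m2, r2, m1)
        s1, s2 = m2, m1
        total1 += r1
        total2 += r2
    return total1, total2

if __name__ == "__main__":
    for risk in ("high", "low"):
        scores = play_repeated_game(QAgent(), QAgent(), risk=risk)
        print(f"{risk}-risk game, total payoffs: {scores}")
```

With payoffs in this shape, both stag-stag and hare-hare are Nash equilibria, and raising the miscoordination payoff lowers the cost of attempting the efficient outcome, which is the sense in which the game's risk condition is manipulated.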
Bearden, J. N. (2001). The evolution of inefficiency in a simulated stag hunt. Behavior Research Methods, Instruments, and Computers, 33(2), 124–129. https://doi.org/10.3758/BF03195357