Tentative exploration on reinforcement learning algorithms for stochastic rewards

Luis Peña; Antonio Latorre; José María Peña; Sascha Ossowski

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2009) 5572 LNAI 336-343

DOI: 10.1007/978-3-642-02319-4_40

2 Citations

3 Readers

Abstract

This paper addresses a way to generate mixed strategies using reinforcement learning algorithms in domains with stochastic rewards. A new algorithm based on the Q-learning model, called TERSQ, is introduced. Unlike other approaches for stochastic scenarios, TERSQ uses a single global exploration rate for all state-action pairs within the same run. This exploration rate is selected at the beginning of each round from a probability distribution, which is updated once the run is finished. In this paper we compare TERSQ with similar approaches that use probability distributions that depend on state-action pairs. Two experimental scenarios are considered. The first deals with the problem of learning the optimal way to combine several evolutionary algorithms used simultaneously by a hybrid approach. In the second, the objective is to learn the best strategy for a set of competing agents in a combat-based video game. © 2009 Springer Berlin Heidelberg.
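The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea it describes: one exploration rate is drawn from a probability distribution at the start of a run, applied to every state-action pair during that run, and the distribution is updated when the run ends. It is not the TERSQ algorithm itself. The candidate rates, the toy two-state task, the reward probabilities, the weight update rule, and all names (`CANDIDATE_EPSILONS`, `sample_epsilon`, `run_episode`, `train`) are assumptions made for illustration.

```python
import random

# Hypothetical candidate exploration rates; the abstract does not list any.
CANDIDATE_EPSILONS = [0.05, 0.1, 0.2, 0.4]


def sample_epsilon(weights, rng):
    """Draw the index of one global exploration rate from the current weights."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def run_episode(q, epsilon, rng, steps=50, alpha=0.1, gamma=0.9):
    """One run of epsilon-greedy Q-learning on a toy two-state, two-action task.

    The same epsilon is used for every state-action pair during the run.
    Rewards are stochastic (assumed toy numbers): action 1 pays 1.0 with
    probability 0.7, action 0 pays 1.0 with probability 0.3.
    Returns the average reward per step, and updates q in place.
    """
    state, total = 0, 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore
        else:
            action = max(range(2), key=lambda a: q[state][a])  # exploit
        pay_prob = 0.7 if action == 1 else 0.3
        reward = 1.0 if rng.random() < pay_prob else 0.0
        next_state = 1 - state  # toy deterministic transition
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state, total = next_state, total + reward
    return total / steps


def train(runs=200, seed=0, step=0.1):
    """Repeat runs, re-sampling the global exploration rate before each one."""
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]
    weights = [1.0] * len(CANDIDATE_EPSILONS)
    for _ in range(runs):
        i = sample_epsilon(weights, rng)
        avg_reward = run_episode(q, CANDIDATE_EPSILONS[i], rng)
        # Assumed update rule (not from the paper): move the chosen rate's
        # weight toward the average reward of the run, floored to stay positive.
        weights[i] = (1 - step) * weights[i] + step * max(avg_reward, 0.01)
    total = sum(weights)
    return q, [w / total for w in weights]


if __name__ == "__main__":
    q_table, epsilon_probs = train()
    for eps, p in zip(CANDIDATE_EPSILONS, epsilon_probs):
        print(f"epsilon={eps}: probability {p:.3f}")
```

By contrast, the approaches the abstract compares against would keep a separate distribution for each state-action pair, so the weights in this sketch would be indexed by state and action as well.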

Cite

Peña, L., Latorre, A., Peña, J. M., & Ossowski, S. (2009). Tentative exploration on reinforcement learning algorithms for stochastic rewards. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5572 LNAI, pp. 336–343). https://doi.org/10.1007/978-3-642-02319-4_40
