Ensemble methods for reinforcement learning with function approximation

Abstract

Ensemble methods combine multiple models to increase predictive performance, but they are mostly applied to labelled data. In this paper we propose several ensemble methods that learn a combined parameterized state-value function from multiple agents. For this purpose, the Temporal-Difference (TD) and Residual-Gradient (RG) update rules, as well as the policy function, are adapted to learn from joint decisions, such as Majority Voting and Averaging of the state values. We apply these ensemble methods to the simple pencil-and-paper game Tic-Tac-Toe and show that an ensemble of three agents outperforms a single agent, both in terms of the Mean Squared Error (MSE) to the true state values and in terms of the resulting policy. We further apply the same methods to learning the shortest path in a 20×20 maze and show empirically that an ensemble of multiple agents learns faster and yields a better policy, i.e. a higher number of correctly chosen actions, than a single agent. © 2011 Springer-Verlag.
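The following is a minimal sketch, not taken from the paper's code, of how the joint decisions described in the abstract could be realized with linear state-value functions. All names (ensemble_value, majority_vote_action, td0_update) and the choice of a linear feature representation phi(s) are illustrative assumptions; the Averaging rule returns the mean of the members' state values, the Majority Voting rule picks the successor state most members rank highest, and a standard TD(0) update is shown for a single member.

    import numpy as np

    # Assumed setup: an ensemble of M agents, each with its own linear
    # state-value function V_i(s) = w_i . phi(s), where phi(s) is a
    # feature vector of the state.

    def ensemble_value(weights, phi_s):
        """Averaging rule: mean of the members' state-value estimates."""
        return np.mean([w @ phi_s for w in weights])

    def majority_vote_action(weights, phi_successors):
        """Majority Voting rule: each member votes for the successor state
        it values highest; the successor with the most votes is chosen."""
        votes = np.zeros(len(phi_successors), dtype=int)
        for w in weights:
            best = int(np.argmax([w @ phi for phi in phi_successors]))
            votes[best] += 1
        return int(np.argmax(votes))

    def td0_update(w, phi_s, phi_s_next, reward, alpha=0.1, gamma=0.9):
        """Standard TD(0) update for one member's weight vector; in the
        paper's setting the bootstrap target would be built from the
        joint (ensemble) decision rather than the member's own value."""
        td_error = reward + gamma * (w @ phi_s_next) - (w @ phi_s)
        return w + alpha * td_error * phi_s

In this sketch, each agent keeps its own weight vector and is updated individually, while action selection is driven by the combined decision, which mirrors the abstract's description of learning from joint decisions.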

Citation (APA)
Faußer, S., & Schwenker, F. (2011). Ensemble methods for reinforcement learning with function approximation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6713 LNCS, pp. 56–65). https://doi.org/10.1007/978-3-642-21557-5_8
