XCS is the most thoroughly investigated classifier system to date. It offers considerable potential and has inherent capabilities for mastering a variety of learning tasks. Besides outstanding successes in various classification and regression tasks, XCS has also proved very effective in certain multi-step environments from the domain of reinforcement learning. In the latter domain in particular, recent advances have mainly been driven by algorithms that model their policies with deep neural networks, among which the Deep-Q-Network (DQN) is a prominent representative. Experience Replay (ER) is one of the crucial factors in the DQN's success, since it stabilizes the training of the neural network-based Q-function approximators. Surprisingly, XCS barely takes advantage of similar mechanisms that leverage remembered raw experiences. To bridge this gap, this paper investigates the benefits of extending XCS with ER. We demonstrate that for single-step tasks ER yields strong improvements in sample efficiency. On the downside, however, we reveal that ER might further aggravate well-studied issues that remain unsolved for XCS when it is applied to sequential decision problems that demand long action chains.
Stein, A., Maier, R., Rosenbauer, L., & Hahner, J. (2020). XCS classifier system with experience replay. In GECCO 2020 - Proceedings of the 2020 Genetic and Evolutionary Computation Conference (pp. 404–413). Association for Computing Machinery. https://doi.org/10.1145/3377930.3390249