Wilson [13] showed how delayed-reward feedback can be used to solve many multi-step problems with the widely used XCS learning classifier system. However, Wilson's method, based on back-propagation of discounted rewards as in Q-learning, runs into difficulties in environments with aliasing states, since the local reward function often fails to converge. This paper describes a different approach to reward feedback, in which a layered reward scheme for XCS classifiers is learnt during training. We show that, with a relatively minor modification to XCS feedback, the approach not only solves problems such as Woods1 but also solves aliasing-state problems such as Littman57, MiyazakiA and MazeB. © Springer-Verlag Berlin Heidelberg 2009.
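The discounted back-propagation the abstract refers to can be sketched as follows. This is a minimal illustration of the Q-learning-style payoff update used in standard XCS, not the paper's layered-reward method; the function names, the parameter values (γ = 0.71 is the discount commonly used in XCS maze experiments, β is the Widrow-Hoff learning rate), and the two-step scenario are illustrative assumptions.

```python
GAMMA = 0.71   # discount factor, a value commonly used in XCS maze experiments
BETA = 0.2     # Widrow-Hoff learning rate (illustrative choice)

def update_prediction(p_prev, reward, max_next_prediction):
    """Move the classifier's payoff prediction toward the discounted
    Q-learning target r + gamma * max P', as in standard XCS."""
    target = reward + GAMMA * max_next_prediction
    return p_prev + BETA * (target - p_prev)

# Illustrative scenario: a classifier one step before a state whose best
# prediction has already converged to 710 (gamma * 1000, i.e. one step
# from a goal paying reward 1000). Its own prediction should converge
# to gamma * 710 = 504.1.
p = 0.0
for _ in range(200):
    p = update_prediction(p, 0.0, 710.0)
```

Under aliasing, two physically distinct states present the same sensory input, so `max_next_prediction` alternates between different true values and the update above chases a moving target, which is the non-convergence problem the paper addresses.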
Chen, K. Y., & Lindsay, P. A. (2009). Feedback of delayed rewards in XCS for environments with aliasing states. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5865 LNAI, pp. 252–261). https://doi.org/10.1007/978-3-642-10427-5_25