Abstract
HEXQ is a reinforcement learning algorithm that decomposes a problem into subtasks and constructs a hierarchy from the state variables. The maximum number of hierarchy levels is bounded by the number of variables representing a state. In HEXQ, values learned for a subtask can be reused in different contexts only if the subtasks are identical; otherwise, values for non-identical subtasks must be trained separately. This paper introduces a method that addresses both restrictions. Experimental results show that the method dramatically reduces training time. © Springer-Verlag Berlin Heidelberg 2005.
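The value-reuse idea described above can be illustrated with a minimal sketch. This is a hypothetical example, not the authors' HEXQ implementation: subtasks judged identical share a single learned value table (so training in one context carries over to another), while non-identical subtasks keep separate tables. The class and signature names are invented for illustration.

```python
# Hypothetical sketch (not the HEXQ implementation): identical subtasks
# share one value table; distinct subtasks are trained separately.
from collections import defaultdict


class SubtaskValueStore:
    """Maps a subtask 'signature' to a shared Q-table.

    Identical subtasks (same signature) reuse the same values;
    distinct signatures get independent tables.
    """

    def __init__(self):
        self._tables = {}

    def table_for(self, signature):
        # Reuse the existing table if this subtask was seen before.
        if signature not in self._tables:
            self._tables[signature] = defaultdict(float)
        return self._tables[signature]


store = SubtaskValueStore()

# Two rooms with identical local dynamics share one signature,
# hence one value table: values learned once are reused everywhere.
room_a = store.table_for("navigate-room")
room_a[("door", "north")] = 1.0
room_b = store.table_for("navigate-room")
assert room_b[("door", "north")] == 1.0  # reused, not retrained

# A different subtask gets its own, separately trained table.
corridor = store.table_for("traverse-corridor")
assert corridor[("door", "north")] == 0.0
```

The paper's contribution, per the abstract, is relaxing exactly these constraints: allowing reuse even when subtasks are not strictly identical, and lifting the variable-count bound on hierarchy depth.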
Citation
Poulton, G., Guo, Y., & Lu, W. (2005). Finding hidden hierarchy in reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3683 LNAI, pp. 554–561). Springer Verlag. https://doi.org/10.1007/11553939_79