Epoch-incremental queue-dyna algorithm

Abstract

Basic reinforcement learning algorithms such as Q-learning are characterized by a short, inexpensive single learning step; however, the number of epochs necessary to achieve the optimal policy is not satisfactory. Many methods reduce the number of necessary epochs, such as TD(λ > 0), Dyna, or prioritized sweeping, but their learning time is considerable. This paper proposes a combination of the Q-learning algorithm, performed in incremental mode, with an acceleration method executed in epoch mode and based on an environment model and the distance to the terminal state. This approach maintains the short duration of a single learning step while achieving efficiency comparable with Dyna or prioritized sweeping. The proposed algorithm is compared with Q(λ)-learning, Dyna-Q, and prioritized sweeping in experiments on three maze tasks. The learning time and the number of epochs necessary to reach the terminal state are used to evaluate the efficiency of the compared algorithms. © 2008 Springer-Verlag Berlin Heidelberg.
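
To make the described combination concrete, the sketch below shows one plausible reading of such an epoch-incremental scheme in Python. It is not the paper's exact algorithm: the environment interface (`reset()`, `step()`, `actions`) and all names here are assumptions. The inner loop performs only the cheap one-step Q-learning update, while the epoch-end phase replays a learned deterministic model backwards from the terminal state in breadth-first order, so states are updated in order of their distance to the terminal state.

```python
import random
from collections import defaultdict, deque

def epoch_incremental_q_learning(env, episodes=100, alpha=0.1,
                                 gamma=0.95, epsilon=0.1):
    """Hypothetical sketch: incremental Q-learning during an epoch,
    plus a model-based backward sweep at the end of each epoch."""
    Q = defaultdict(float)              # Q[(state, action)] -> value
    model = {}                          # (state, action) -> (reward, next_state)
    predecessors = defaultdict(set)     # next_state -> {(state, action)}

    for _ in range(episodes):
        state = env.reset()
        done = False
        while not done:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)

            # cheap incremental Q-learning update (one step)
            best_next = max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])

            # record the (assumed deterministic) environment model
            model[(state, action)] = (reward, next_state)
            predecessors[next_state].add((state, action))
            state = next_state

        # epoch-mode acceleration: breadth-first sweep backwards from the
        # terminal state, updating Q in order of distance to the terminal
        frontier = deque([state])       # `state` is terminal here
        visited = {state}
        while frontier:
            s = frontier.popleft()
            for (ps, pa) in predecessors[s]:
                r, _ = model[(ps, pa)]
                best = max(Q[(s, a)] for a in env.actions)
                Q[(ps, pa)] += alpha * (r + gamma * best - Q[(ps, pa)])
                if ps not in visited:
                    visited.add(ps)
                    frontier.append(ps)
    return Q
```

The split between the two loops is the point of the approach as the abstract describes it: the per-step cost stays that of plain Q-learning, and the model-based value propagation, which gives Dyna and prioritized sweeping their sample efficiency, is paid only once per epoch rather than on every step.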

Citation (APA)

Zajdel, R. (2008). Epoch-incremental queue-dyna algorithm. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5097 LNAI, pp. 1160–1170). https://doi.org/10.1007/978-3-540-69731-2_109
