Learning biped locomotion based on Q-learning and neural networks

Abstract

Robot postures evolve continuously until an impact occurs. To handle this continuous state problem, a Q-learning controller based on a Back-Propagation (BP) neural network is designed. Instead of a Q table, a multi-input multi-output BP neural network is employed to compute Q values over the continuous state space. Eligibility traces are used to address the temporal credit assignment problem in Q-learning, and we integrate the eligibility trace algorithm into the gradient descent update for continuous states. To avoid dimension explosion, an inverted-pendulum pose-energy model is built to reduce the dimensionality of the input state space. To balance exploration and exploitation in Q-learning, we use a new ε-greedy method with a variable stochastic probability, which decreases as the step number increases. Simulation results indicate that the proposed method is effective. © 2011 Springer-Verlag.
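The abstract's pieces fit together as a TD-style update: the BP network outputs one Q value per action, an eligibility trace accumulates the gradient of the chosen action's Q value, and the TD error scales the traced parameter update, while ε shrinks as steps accumulate. Below is a minimal sketch of this combination in Python/NumPy; the layer sizes, hyperparameters, toy dynamics, and the hyperbolic form of the decaying ε are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Minimal sketch, assuming: a one-hidden-layer BP network as the Q-function,
# accumulating eligibility traces applied to the gradient of Q(s, a), and an
# epsilon that shrinks with the global step count. All sizes, hyperparameters,
# and the stand-in dynamics below are illustrative, not taken from the paper.

rng = np.random.default_rng(0)

STATE_DIM, N_ACTIONS, HIDDEN = 2, 3, 16      # assumed dimensions
ALPHA, GAMMA, LAM = 0.01, 0.95, 0.9          # assumed hyperparameters

W1 = rng.normal(scale=0.1, size=(HIDDEN, STATE_DIM)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(N_ACTIONS, HIDDEN)); b2 = np.zeros(N_ACTIONS)
params = [W1, b1, W2, b2]

def forward(s):
    h = np.tanh(W1 @ s + b1)                 # hidden layer activation
    return W2 @ h + b2, h                    # Q value for every action

def grad_q(s, h, a):
    """Gradient of the scalar Q(s, a) w.r.t. each parameter array."""
    dW2 = np.zeros_like(W2); db2 = np.zeros_like(b2)
    dW2[a] = h; db2[a] = 1.0
    dh = W2[a] * (1.0 - h ** 2)              # back-propagate through tanh
    return [np.outer(dh, s), dh, dW2, db2]

def epsilon(step, eps0=0.5, decay=1e-3):
    """Exploration probability that decreases as the step number grows."""
    return eps0 / (1.0 + decay * step)

def toy_step(s, a):
    """Hypothetical dynamics: drift the state, reward staying near zero."""
    s2 = 0.9 * s + 0.1 * (a - 1) + 0.01 * rng.normal(size=STATE_DIM)
    return s2, -float(s2 @ s2)

step = 0
for episode in range(50):
    s = rng.normal(size=STATE_DIM)
    traces = [np.zeros_like(p) for p in params]    # reset traces per episode
    for t in range(200):
        q, h = forward(s)
        explore = rng.random() < epsilon(step)
        a = int(rng.integers(N_ACTIONS)) if explore else int(np.argmax(q))
        s2, r = toy_step(s, a)
        q2, _ = forward(s2)
        delta = r + GAMMA * q2.max() - q[a]        # TD error
        for p, e, g in zip(params, traces, grad_q(s, h, a)):
            e *= GAMMA * LAM                       # decay old eligibility
            e += g                                 # add current gradient
            p += ALPHA * delta * e                 # traced gradient step
        s = s2
        step += 1
```

One design point worth flagging: strict Watkins-style Q(λ) would zero the traces after an exploratory action, since the max-Q bootstrap then diverges from the behavior policy. The sketch keeps the traces for brevity; the paper's exact handling is not stated in the abstract.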

Citation (APA)

Ziqiang, P., Gang, P., & Ling, Y. (2011). Learning biped locomotion based on Q-learning and neural networks. In Lecture Notes in Electrical Engineering (Vol. 122 LNEE, pp. 313–321). https://doi.org/10.1007/978-3-642-25553-3_39
