Learning biped locomotion based on Q-learning and neural networks

Abstract

Robot postures evolve continuously until an impact occurs. To handle this continuous state problem, a Q-learning controller based on a Back-Propagation (BP) neural network is designed. Instead of a Q table, a multi-input multi-output BP neural network is employed to compute Q values over the continuous state space. Eligibility traces are used to address the temporal credit assignment problem in Q-learning, and we integrate the eligibility trace algorithm into the gradient descent update for continuous states. To avoid dimension explosion, an inverted-pendulum pose-energy model is built to reduce the dimensionality of the input state space. To balance exploration and exploitation in Q-learning, we use a new ε-greedy method with a variable stochastic probability, which decreases as the step number increases. Simulation results indicate that the proposed method is effective. © 2011 Springer-Verlag.
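The abstract's pieces fit together as a TD-style update: the BP network outputs one Q value per action, an eligibility trace accumulates the gradient of the chosen action's Q value, and the TD error scales the traced parameter update, while ε shrinks as steps accumulate. Below is a minimal sketch of this combination in Python/NumPy; the layer sizes, hyperparameters, toy dynamics, and the hyperbolic form of the decaying ε are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

# Minimal sketch, assuming: a one-hidden-layer BP network as the Q-function,
# accumulating eligibility traces applied to the gradient of Q(s, a), and an
# epsilon that shrinks with the global step count. All sizes, hyperparameters,
# and the stand-in dynamics below are illustrative, not taken from the paper.

rng = np.random.default_rng(0)

STATE_DIM, N_ACTIONS, HIDDEN = 2, 3, 16      # assumed dimensions
ALPHA, GAMMA, LAM = 0.01, 0.95, 0.9          # assumed hyperparameters

W1 = rng.normal(scale=0.1, size=(HIDDEN, STATE_DIM)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(N_ACTIONS, HIDDEN)); b2 = np.zeros(N_ACTIONS)
params = [W1, b1, W2, b2]

def forward(s):
    h = np.tanh(W1 @ s + b1)                 # hidden layer activation
    return W2 @ h + b2, h                    # Q value for every action

def grad_q(s, h, a):
    """Gradient of the scalar Q(s, a) w.r.t. each parameter array."""
    dW2 = np.zeros_like(W2); db2 = np.zeros_like(b2)
    dW2[a] = h; db2[a] = 1.0
    dh = W2[a] * (1.0 - h ** 2)              # back-propagate through tanh
    return [np.outer(dh, s), dh, dW2, db2]

def epsilon(step, eps0=0.5, decay=1e-3):
    """Exploration probability that decreases as the step number grows."""
    return eps0 / (1.0 + decay * step)

def toy_step(s, a):
    """Hypothetical dynamics: drift the state, reward staying near zero."""
    s2 = 0.9 * s + 0.1 * (a - 1) + 0.01 * rng.normal(size=STATE_DIM)
    return s2, -float(s2 @ s2)

step = 0
for episode in range(50):
    s = rng.normal(size=STATE_DIM)
    traces = [np.zeros_like(p) for p in params]    # reset traces per episode
    for t in range(200):
        q, h = forward(s)
        explore = rng.random() < epsilon(step)
        a = int(rng.integers(N_ACTIONS)) if explore else int(np.argmax(q))
        s2, r = toy_step(s, a)
        q2, _ = forward(s2)
        delta = r + GAMMA * q2.max() - q[a]        # TD error
        for p, e, g in zip(params, traces, grad_q(s, h, a)):
            e *= GAMMA * LAM                       # decay old eligibility
            e += g                                 # add current gradient
            p += ALPHA * delta * e                 # traced gradient step
        s = s2
        step += 1
```

One design point worth flagging: strict Watkins-style Q(λ) would zero the traces after an exploratory action, since the max-Q bootstrap then diverges from the behavior policy. The sketch keeps the traces for brevity; the paper's exact handling is not stated in the abstract.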

Citation (APA)

Ziqiang, P., Gang, P., & Ling, Y. (2011). Learning biped locomotion based on Q-learning and neural networks. In Lecture Notes in Electrical Engineering (Vol. 122 LNEE, pp. 313–321). https://doi.org/10.1007/978-3-642-25553-3_39
