Exponential moving averaged q-network for DDPG


Abstract

The instability and high variance of the Q-network lead to overestimation bias in the Deep Q-Network (DQN). Double DQN and Averaged-DQN mitigate this problem through different approaches. Building on Averaged-DQN, we theoretically prove the effective variance reduction of the Exponential Moving Average (EMA) in DQN and further illustrate its efficiency in the target network of Deep Deterministic Policy Gradient (DDPG). We then propose the A3QDDPG algorithm, which introduces an EMA Q-network that is independent of the target Q-network when updating the policy. Experiments on ten continuous control environments from MuJoCo show that A3QDDPG achieves better performance than DDPG in terms of average return, and the overestimation phenomenon of DDPG can also be observed in some environments in terms of average Q-value.
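The mechanism described in the abstract is an exponential moving average over the online Q-network's parameters, maintained separately from DDPG's usual target network and consulted when updating the policy. The following is a minimal sketch of such an EMA parameter update, assuming PyTorch; the network architecture, decay rate beta, dimensions, and function names are illustrative assumptions and are not taken from the paper.

import copy
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Minimal state-action value network (architecture is illustrative)."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, action], dim=-1))

def ema_update(ema_net: nn.Module, online_net: nn.Module, beta: float = 0.999) -> None:
    """Exponential moving average of parameters:
    theta_ema <- beta * theta_ema + (1 - beta) * theta_online."""
    with torch.no_grad():
        for p_ema, p in zip(ema_net.parameters(), online_net.parameters()):
            p_ema.mul_(beta).add_(p, alpha=1.0 - beta)

# Usage sketch: the EMA copy is kept independent of the DDPG target network
# and would be the critic queried in the policy (actor) update step.
q_online = QNetwork(state_dim=17, action_dim=6)
q_ema = copy.deepcopy(q_online)
# ... after each gradient step on q_online:
ema_update(q_ema, q_online, beta=0.999)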

Citation (APA)

Shen, X., Yin, C., Chai, Y., & Hou, X. (2019). Exponential moving averaged q-network for DDPG. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11857 LNCS, pp. 562–572). Springer. https://doi.org/10.1007/978-3-030-31654-9_48
