HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem

Abstract

Despite the success of existing meta reinforcement learning methods, they still have difficulty learning a meta policy effectively for RL problems with sparse reward. To address this, we develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse reward RL problems. It consists of three modules: the cross-environment meta state embedding module, which constructs a common meta state space to adapt to different environments; the meta-state-based environment-specific meta reward shaping module, which effectively extends the original sparse reward trajectory through cross-environmental knowledge complementarity; and the meta policy module, which, with the shaped meta reward, achieves better generalization and efficiency. Experiments in sparse-reward environments show the superiority of HMRL in both transferability and policy learning efficiency.
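The abstract describes an architecture rather than code; below is a minimal, illustrative sketch (not the authors' implementation) of how the three modules could be wired together. It assumes PyTorch and uses entirely hypothetical class names (MetaStateEmbedding, MetaRewardShaper, MetaPolicy), network sizes, and a potential-based shaping rule; the paper's actual shaping function and training procedure may differ.

# Minimal sketch of the three-module structure described in the abstract.
# All names, layer sizes, and the shaping form are assumptions for illustration.

import torch
import torch.nn as nn


class MetaStateEmbedding(nn.Module):
    """Cross-environment module: maps raw states from any environment
    into a shared meta state space (hypothetical architecture)."""

    def __init__(self, state_dim: int, meta_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, meta_dim)
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


class MetaRewardShaper(nn.Module):
    """Environment-specific reward shaping on top of the meta state:
    densifies the sparse extrinsic reward (potential-based form chosen
    here purely as one plausible example)."""

    def __init__(self, meta_dim: int):
        super().__init__()
        self.potential = nn.Sequential(
            nn.Linear(meta_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, meta_s, meta_s_next, sparse_r, gamma: float = 0.99):
        # r_shaped = r_sparse + gamma * phi(s') - phi(s)
        return sparse_r + gamma * self.potential(meta_s_next) - self.potential(meta_s)


class MetaPolicy(nn.Module):
    """Meta policy acting on the shared meta state space."""

    def __init__(self, meta_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(meta_dim, 64), nn.ReLU(), nn.Linear(64, action_dim)
        )

    def forward(self, meta_s: torch.Tensor) -> torch.Tensor:
        return torch.softmax(self.net(meta_s), dim=-1)


if __name__ == "__main__":
    embed = MetaStateEmbedding(state_dim=8, meta_dim=16)
    shaper = MetaRewardShaper(meta_dim=16)
    policy = MetaPolicy(meta_dim=16, action_dim=4)

    s, s_next = torch.randn(1, 8), torch.randn(1, 8)
    sparse_r = torch.zeros(1, 1)  # typical sparse-reward transition
    m, m_next = embed(s), embed(s_next)
    print(policy(m), shaper(m, m_next, sparse_r))

The intent of the sketch is only to show the data flow implied by the abstract: states from different environments are first embedded into a common meta state space, the sparse reward is shaped there per environment, and a single meta policy is trained on the shaped signal.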

Citation (APA)

Hua, Y., Wang, X., Jin, B., Li, W., Yan, J., He, X., & Zha, H. (2021). HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 637–645). Association for Computing Machinery. https://doi.org/10.1145/3447548.3467242
