Fitness landscape features and reward shaping in reinforcement learning policy spaces

Abstract

Reinforcement learning (RL) algorithms have received a lot of attention in recent years. However, relatively little work has been dedicated to analysing RL problems, which are thought to contain unique challenges, such as sparsity of the reward signal. Reward shaping is one approach that may help alleviate the sparse reward problem. In this paper we use fitness landscape features to study how reward shaping affects the underlying optimisation landscape of RL problems. Our results indicate that features such as deception, ruggedness, searchability, and symmetry can all be greatly affected by reward shaping, while neutrality, dispersion, and the number of local optima remain relatively invariant. This may provide some guidance as to the potential effectiveness of reward shaping for different algorithms, depending on what features they are sensitive to. Additionally, all of the reward functions we studied produced policy landscapes that contain a single local optimum and very high neutrality. This suggests that algorithms that explore spaces globally, rather than locally, may perform well on RL problems, and may help explain the success of evolutionary methods on RL problems. Furthermore, we suspect that the high neutrality of these landscapes is connected to the issue of reward sparsity in RL.
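
To make the notion of neutrality in a policy landscape concrete, the following is a minimal Python sketch. It is not taken from the paper: the chain environment, the constants `N_STATES` and `HORIZON`, the one-action-flip neighbourhood, and the simple "fraction of equal-fitness neighbours" measure are all assumptions chosen for illustration. It enumerates every deterministic policy of a tiny sparse-reward task and counts how often changing a single action leaves the return unchanged.

```python
from itertools import product

# Hypothetical toy task (not from the paper): a chain of states 0..N_STATES-1.
# The agent starts in state 0 and receives reward 1 only on reaching the last state.
N_STATES = 6
HORIZON = 10  # maximum number of steps per episode

def sparse_return(policy):
    """Episode return of a deterministic policy under the sparse reward.

    policy[s] is the action in non-goal state s: 1 = move right, 0 = move left.
    """
    state = 0
    for _ in range(HORIZON):
        step = 1 if policy[state] == 1 else -1
        state = max(0, state + step)  # a wall on the left end of the chain
        if state == N_STATES - 1:
            return 1.0
    return 0.0

def neighbours(policy):
    """Yield every policy that differs from `policy` in exactly one action."""
    for i in range(len(policy)):
        yield policy[:i] + (1 - policy[i],) + policy[i + 1:]

def neutrality(fitness):
    """Fraction of (policy, neighbour) pairs with identical fitness.

    An assumed, simplified measure computed by exhaustive enumeration.
    """
    equal = 0
    total = 0
    for policy in product((0, 1), repeat=N_STATES - 1):
        f = fitness(policy)
        for nb in neighbours(policy):
            total += 1
            if fitness(nb) == f:
                equal += 1
    return equal / total

if __name__ == "__main__":
    print(neutrality(sparse_return))
```

In this toy setting only the policy that always moves right reaches the goal, so almost every single-action change leaves the return at zero. That is one way to picture how a sparse reward could produce a largely flat landscape with a single optimum. A shaped reward could be passed to `neutrality` in place of `sparse_return` to compare landscapes, though nothing here reproduces the paper's experiments.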

Cite

du Preez-Wilkinson, N., & Gallagher, M. (2020). Fitness landscape features and reward shaping in reinforcement learning policy spaces. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12270 LNCS, pp. 500–514). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58115-2_35
