Hypervolume-based multi-objective reinforcement learning

Abstract

Indicator-based evolutionary algorithms are among the best-performing methods for solving multi-objective optimization (MOO) problems. In reinforcement learning (RL), however, incorporating a quality indicator into an algorithm's decision logic had not been attempted before. In this paper, we propose a novel online multi-objective reinforcement learning (MORL) algorithm that uses the hypervolume indicator as an action selection strategy. We call this algorithm the hypervolume-based MORL algorithm, or HB-MORL, and conduct an empirical study of its performance using multiple quality assessment metrics from multi-objective optimization. We compare HB-MORL on several environments to two multi-objective algorithms that rely on scalarization techniques, namely linear scalarization and the weighted Chebyshev function. We conclude that HB-MORL significantly outperforms the linear scalarization method and performs on par with the Chebyshev algorithm, without requiring any user-specified emphasis on particular objectives. © 2013 Springer-Verlag.
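For readers unfamiliar with these action selection strategies, the sketch below contrasts them in a minimal two-objective setting. It is an illustrative Python sketch, not the authors' implementation: the function names, the archive-based hypervolume selection rule, the reference point, and the utopian point used by the Chebyshev function are all assumptions made for the example.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` relative to the reference point `ref`,
    for a two-objective maximization problem."""
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]),
                 key=lambda p: p[0], reverse=True)
    hv, prev_y = 0.0, ref[1]
    for x, y in pts:
        if y > prev_y:                        # dominated points add no area
            hv += (x - ref[0]) * (y - prev_y)
            prev_y = y
    return hv

def hv_greedy_action(q_vectors, archive, ref):
    """Hypervolume-based selection: choose the action whose Q-vector,
    added to an archive of value vectors, yields the largest hypervolume."""
    return max(q_vectors,
               key=lambda a: hypervolume_2d(archive + [q_vectors[a]], ref))

def linear_greedy_action(q_vectors, w):
    """Linear scalarization baseline: maximize the weighted sum w . Q(s, a)."""
    return max(q_vectors,
               key=lambda a: sum(wi * qi for wi, qi in zip(w, q_vectors[a])))

def chebyshev_greedy_action(q_vectors, w, utopia):
    """Weighted Chebyshev baseline: minimize the weighted max-distance
    between Q(s, a) and a utopian point z* (smaller is better)."""
    def dist(a):
        return max(wi * abs(qi - zi)
                   for wi, qi, zi in zip(w, q_vectors[a], utopia))
    return min(q_vectors, key=dist)

if __name__ == "__main__":
    # Hypothetical Q-vectors for three actions in some state (two objectives).
    q = {0: (0.9, 0.2), 1: (0.5, 0.6), 2: (0.1, 0.9)}
    print(hv_greedy_action(q, archive=[(0.4, 0.4)], ref=(0.0, 0.0)))    # -> 1
    print(linear_greedy_action(q, w=(0.5, 0.5)))                        # -> 0
    print(chebyshev_greedy_action(q, w=(0.5, 0.5), utopia=(1.0, 1.0)))  # -> 1
```

Note the contrast the abstract draws: both scalarized selectors need a user-supplied weight vector w expressing emphasis on particular objectives, while the hypervolume-based rule needs only a fixed reference point.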

Citation (APA)

Van Moffaert, K., Drugan, M. M., & Nowé, A. (2013). Hypervolume-based multi-objective reinforcement learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7811 LNCS, pp. 352–366). https://doi.org/10.1007/978-3-642-37140-0_28
