Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning


Abstract

The fields of Reinforcement Learning (RL) and Optimization aim at finding an optimal solution to a problem, characterized by an objective function. The exploration-exploitation dilemma (EED) is a well-known subject in both fields: a considerable body of literature has already addressed it and shown that it must be handled carefully to achieve good performance. Yet many real-life problems involve optimizing multiple objectives. Multi-Policy Multi-Objective Reinforcement Learning (MPMORL) offers a way to learn a set of optimized behaviours for the agent in such problems. This work introduces a modular framework for the learning phase of such algorithms, which eases the study of the EED in Inner-Loop MPMORL algorithms. We present three new exploration strategies inspired by the metaheuristics domain. To assess the performance of our methods on various environments, we use a classical benchmark, the Deep Sea Treasure (DST), and also propose a harder version of it. Our experiments show that all of the proposed strategies outperform the current state-of-the-art ε-greedy based methods on the studied benchmarks.
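The abstract contrasts the proposed metaheuristic strategies with ε-greedy baselines. As a point of reference, the sketch below shows a minimal ε-greedy action selection over a weighted-sum scalarized multi-objective Q-table, the kind of baseline inner-loop MORL agents commonly use; the function name, weights, and Q-values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative sketch only: the paper's metaheuristic exploration strategies
# are not detailed in this abstract. This shows an epsilon-greedy baseline
# over a scalarized (weighted-sum) multi-objective Q-table. All names,
# weights, and values are assumptions.

def epsilon_greedy_action(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Example: one state with 4 actions and 2 objectives (e.g. treasure value vs. time)
rng = np.random.default_rng(0)
q_multi = np.array([[1.0, -3.0],    # Q(s, a) per objective
                    [5.0, -7.0],
                    [0.0, -1.0],
                    [8.0, -14.0]])
weights = np.array([0.7, 0.3])      # scalarization weights used by the inner loop
scalar_q = q_multi @ weights        # one scalar value per action
action = epsilon_greedy_action(scalar_q, epsilon=0.1, rng=rng)
print(action)
```

With a small ε, the agent mostly follows the action with the best scalarized value and only occasionally explores at random; the paper's contribution is to replace this undirected exploration with metaheuristics-inspired strategies.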

Citation (APA)

Felten, F., Danoy, G., Talbi, E. G., & Bouvry, P. (2022). Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning. In International Conference on Agents and Artificial Intelligence (Vol. 2, pp. 662–673). Science and Technology Publications, Lda. https://doi.org/10.5220/0010989100003116
