Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning


Abstract

The fields of Reinforcement Learning (RL) and Optimization aim at finding an optimal solution to a problem, characterized by an objective function. The exploration-exploitation dilemma (EED) is a well-known subject in both fields: a considerable body of literature has already addressed it and shown that it must be handled carefully to achieve good performance. Yet many real-life problems involve optimizing multiple objectives. Multi-Policy Multi-Objective Reinforcement Learning (MPMORL) offers a way to learn a set of optimized behaviours for the agent in such problems. This work introduces a modular framework for the learning phase of such algorithms, which eases the study of the EED in Inner-Loop MPMORL algorithms. We present three new exploration strategies inspired by the metaheuristics domain. To assess the performance of our methods on various environments, we use a classical benchmark, the Deep Sea Treasure (DST), and also propose a harder version of it. Our experiments show that all of the proposed strategies outperform the current state-of-the-art ε-greedy based methods on the studied benchmarks.
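The abstract contrasts the proposed metaheuristic strategies with ε-greedy baselines. As a point of reference, the sketch below shows a minimal ε-greedy action selection over a weighted-sum scalarized multi-objective Q-table, the kind of baseline inner-loop MORL agents commonly use; the function name, weights, and Q-values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative sketch only: the paper's metaheuristic exploration strategies
# are not detailed in this abstract. This shows an epsilon-greedy baseline
# over a scalarized (weighted-sum) multi-objective Q-table. All names,
# weights, and values are assumptions.

def epsilon_greedy_action(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Example: one state with 4 actions and 2 objectives (e.g. treasure value vs. time)
rng = np.random.default_rng(0)
q_multi = np.array([[1.0, -3.0],    # Q(s, a) per objective
                    [5.0, -7.0],
                    [0.0, -1.0],
                    [8.0, -14.0]])
weights = np.array([0.7, 0.3])      # scalarization weights used by the inner loop
scalar_q = q_multi @ weights        # one scalar value per action
action = epsilon_greedy_action(scalar_q, epsilon=0.1, rng=rng)
print(action)
```

With a small ε, the agent mostly follows the action with the best scalarized value and only occasionally explores at random; the paper's contribution is to replace this undirected exploration with metaheuristics-inspired strategies.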

Citation (APA)

Felten, F., Danoy, G., Talbi, E. G., & Bouvry, P. (2022). Metaheuristics-based Exploration Strategies for Multi-Objective Reinforcement Learning. In International Conference on Agents and Artificial Intelligence (Vol. 2, pp. 662–673). Science and Technology Publications, Lda. https://doi.org/10.5220/0010989100003116
