Assessing Policy, Loss and Planning Combinations in Reinforcement Learning Using a New Modular Architecture


Abstract

The model-based reinforcement learning paradigm, which combines planning algorithms with neural network models, has recently achieved unprecedented results in diverse applications, leading to what is now known as deep reinforcement learning. These agents are quite complex and involve multiple components, which creates challenges for the research and development of new models. In this work, we propose a new modular software architecture suited to these types of agents, together with a set of building blocks that can be easily reused and assembled to construct new model-based reinforcement learning agents. These building blocks include search algorithms, policies, and loss functions (code available at https://github.com/GaspTO/Modular_MBRL). We illustrate the use of this architecture by combining several of these building blocks to implement and test agents optimized for three different test environments: Cartpole, Minigrid, and Tictactoe. One particular search algorithm made available in our implementation, which we call averaged minimax and which had not previously been used in reinforcement learning, achieved good results in all three environments. Experiments performed with our implementation showed that the best combination of search, policy, and loss algorithms is heavily problem dependent.
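The core idea of the architecture — assembling an agent from interchangeable search, policy, and loss components — can be sketched as follows. This is a minimal illustration only; the names and interfaces below are hypothetical and do not reflect the actual API of the Modular_MBRL repository.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical building-block interfaces: an agent is composed of a
# search (planning) component, a policy (action selection) component,
# and a loss (training signal) component, each freely swappable.

@dataclass
class Agent:
    search: Callable[[List[float]], List[float]]  # planning: refine value estimates
    policy: Callable[[List[float]], int]          # acting: pick an action index
    loss: Callable[[float, float], float]         # learning: prediction vs. target

def identity_search(values: List[float]) -> List[float]:
    # Trivial stand-in for a planner: returns the raw estimates unchanged.
    return values

def greedy_policy(values: List[float]) -> int:
    # Selects the action with the highest estimated value.
    return max(range(len(values)), key=lambda i: values[i])

def squared_error(pred: float, target: float) -> float:
    return (pred - target) ** 2

# Assemble one concrete agent from the blocks and use it to act.
agent = Agent(search=identity_search, policy=greedy_policy, loss=squared_error)
action = agent.policy(agent.search([0.1, 0.9, 0.3]))
```

Swapping in a different planner (e.g. a minimax-style search) or a stochastic policy would only require replacing the corresponding field, which is the reuse-and-recombine property the architecture aims for.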

Citation (APA)

Oliveira, T. G., & Oliveira, A. L. (2022). Assessing Policy, Loss and Planning Combinations in Reinforcement Learning Using a New Modular Architecture. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13566 LNAI, pp. 427–439). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-16474-3_35
