Model-based reinforcement learning algorithms are typically more sample efficient than their model-free counterparts, especially in sparse reward problems. Unfortunately, many interesting domains are too complex to specify complete models for, and learning a model requires a large number of environment samples. If we could specify an incomplete model and allow the agent to learn how best to use it, we could take advantage of our partial understanding of many domains. In this work we propose SAGE, an algorithm combining learning and planning to exploit a previously unusable class of incomplete models. SAGE combines the strengths of symbolic planning and neural learning in a novel way, outperforming competing methods on variations of Taxi world and Minecraft.
Citation:
Chester, A., Dann, M., Zambetta, F., & Thangarajah, J. (2024). SAGE: Generating Symbolic Goals for Myopic Models in Deep Reinforcement Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14472 LNAI, pp. 274–285). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-99-8391-9_22