The application of deep learning to NP-hard combinatorial optimization problems is an emerging research trend, and a number of interesting approaches have been published over the last few years. In this work we address robust optimization, a more complex variant in which a max-min problem must be solved. We obtain robust solutions by solving the inner minimization problem exactly and applying Reinforcement Learning to learn a heuristic for the outer problem. The minimization term in the inner objective poses an obstacle to existing RL-based approaches, as its value depends on the full solution in a non-linear manner and cannot be evaluated for the partial solutions constructed by the agent over the course of each episode. We overcome this obstacle by defining the reward in terms of the one-step advantage over a baseline policy, whose role can be played by any fast heuristic for the given problem. The agent is trained to maximize the total advantage, which, as we show, is equivalent to the original objective. We validate our approach by solving min-max versions of standard benchmarks for the Capacitated Vehicle Routing Problem and the Traveling Salesperson Problem, where our agents obtain near-optimal solutions and improve upon the baselines.
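The equivalence between total advantage and the original objective can be illustrated with a small sketch. It rests on one reading of the abstract, which is an assumption here: the reward for a step is the change in the value of the baseline-completed solution caused by the agent's action, so the rewards telescope to the agent's final objective value minus the baseline's own value, and the latter is a constant. The plain negative tour length stands in for the robust objective (whose inner problem is not modeled), nearest neighbor stands in for the fast heuristic, and all function names are hypothetical.

```python
import math
import random


def objective(points, tour):
    """Stand-in objective to maximize; defined only for complete tours."""
    n = len(tour)
    length = sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % n]]) for i in range(n)
    )
    return -length


def baseline_complete(points, partial):
    """Complete a partial tour with the nearest-neighbor heuristic."""
    tour = list(partial)
    remaining = set(range(len(points))) - set(tour)
    while remaining:
        last = points[tour[-1]]
        # Ties are broken by index so the completion is deterministic.
        nxt = min(remaining, key=lambda j: (math.dist(last, points[j]), j))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour


def baseline_value(points, partial):
    """Objective of the full tour the baseline builds from `partial`."""
    return objective(points, baseline_complete(points, partial))


def run_episode(points, policy):
    """Build a tour step by step, recording one-step advantage rewards."""
    partial, rewards = [0], []
    while len(partial) < len(points):
        before = baseline_value(points, partial)
        partial = partial + [policy(points, partial)]
        rewards.append(baseline_value(points, partial) - before)
    return partial, rewards


def make_random_policy(seed):
    """A placeholder for the learned agent: picks a random unvisited node."""
    rng = random.Random(seed)

    def policy(points, partial):
        unvisited = [j for j in range(len(points)) if j not in partial]
        return rng.choice(unvisited)

    return policy


def follow_baseline(points, partial):
    """A policy that always takes the baseline's next step."""
    return baseline_complete(points, partial)[len(partial)]


rng = random.Random(1)
points = [(rng.random(), rng.random()) for _ in range(8)]
tour, rewards = run_episode(points, make_random_policy(0))

# Total advantage equals the agent's objective minus the baseline's objective.
gap = objective(points, tour) - baseline_value(points, [0])
assert math.isclose(sum(rewards), gap, abs_tol=1e-9)
```

In this sketch every reward is computed from complete solutions only, so the objective never has to be evaluated on a partial tour, and a policy that always follows the baseline collects zero reward at every step.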
CITATION STYLE
Jacobs, T., Alesiani, F., & Ermis, G. (2021). Reinforcement Learning for Route Optimization with Robustness Guarantees. In IJCAI International Joint Conference on Artificial Intelligence (pp. 2592–2598). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2021/357