Abstract
A wide range of discrete planning problems can be solved optimally using graph search algorithms. However, optimal search quickly becomes infeasible with increased complexity of a problem. In such a case, heuristics that guide the planning process towards the goal state can increase performance considerably. Unfortunately, heuristics are often unavailable or need manual and time-consuming engineering. Building upon recent results on applying deep learning to learn generalized reactive policies, we propose to learn heuristics by imitation learning. After learning heuristics based on optimal examples, they are used to guide a classical search algorithm to solve unseen tasks. However, directly applying learned heuristics in search algorithms such as A* breaks optimality guarantees, since learned heuristics are not necessarily admissible. Therefore, we (i) propose a novel method that utilizes learned heuristics to guide Focal Search A*, a variant of A* with guarantees on bounded suboptimality; (ii) compare the complexity and performance of jointly learning individual policies for multiple robots with an approach that learns one policy for all robots; (iii) thoroughly examine how learned policies generalize to previously unseen environments and demonstrate considerably improved performance in a simulated complex dynamic coverage problem.
Cite
CITATION STYLE
Spies, M., Todescato, M., Becker, H., Kesper, P., Waniek, N., & Guo, M. (2019). Bounded suboptimal search with learned heuristics for multi-agent systems. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (pp. 2387–2394). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33012387
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.