Combining policy search with planning in multi-agent cooperation

Abstract

It is cooperation that essentially differentiates multi-agent systems (MASs) from single-agent intelligence. In realistic MAS applications such as RoboCup, repeated work has shown that traditional machine learning (ML) approaches have difficulty mapping directly from cooperative behaviours to actuator outputs. To overcome this problem, vertical layered architectures are commonly used to break cooperation down into behavioural layers; ML is then used to generate the different low-level skills, and a planning mechanism is added on top to create high-level cooperation. We propose a novel method called Policy Search Planning (PSP), in which policy search is used to find an optimal policy for selecting plans from a plan pool. PSP extends an existing gradient-search method (GPOMDP) to the MAS domain. We demonstrate how PSP can be used in RoboCup Simulation, and our experimental results show robustness, adaptability, and superior performance over other methods. © 2009 Springer Berlin Heidelberg.
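The core idea of the abstract, policy-gradient search over a discrete plan pool, can be illustrated with a short sketch. The following is a minimal, hypothetical GPOMDP-style gradient estimator for a softmax plan-selection policy; the names (N_PLANS, env_step, the feature vector phi) and the reward interface are assumptions made for illustration, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical sketch: GPOMDP-style policy gradient for selecting plans
# from a discrete plan pool, using a softmax policy over state features.
rng = np.random.default_rng(0)
N_PLANS = 4        # size of the plan pool (assumed)
N_FEATURES = 6     # dimension of the state-feature vector (assumed)
theta = np.zeros((N_PLANS, N_FEATURES))  # policy parameters

def softmax_policy(phi):
    """Probability of selecting each plan given feature vector phi."""
    logits = theta @ phi
    p = np.exp(logits - logits.max())
    return p / p.sum()

def gpomdp_episode(env_step, phi0, beta=0.9, T=100):
    """One episode of the GPOMDP gradient estimate.

    z is the beta-discounted eligibility trace of grad-log-probabilities;
    delta is the running average of r_t * z_t, GPOMDP's gradient estimate.
    env_step(plan) is an assumed interface: it executes the chosen plan
    and returns (reward, next feature vector).
    """
    z = np.zeros_like(theta)
    delta = np.zeros_like(theta)
    phi = phi0
    for t in range(T):
        p = softmax_policy(phi)
        plan = rng.choice(N_PLANS, p=p)
        # grad of log softmax w.r.t. theta: (indicator - p) outer phi
        grad_log = -np.outer(p, phi)
        grad_log[plan] += phi
        z = beta * z + grad_log
        r, phi = env_step(plan)
        delta += (r * z - delta) / (t + 1)
    return delta
```

A learning step would then be theta += alpha * gpomdp_episode(...) per episode; the beta-discounted trace z is what distinguishes GPOMDP from plain REINFORCE, trading a small bias for lower-variance gradient estimates.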

Citation (APA)

Ma, J., & Cameron, S. (2009). Combining policy search with planning in multi-agent cooperation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5399 LNAI, pp. 532–543). https://doi.org/10.1007/978-3-642-02921-9_46
