This chapter presents an overview of the decentralized POMDP (Dec-POMDP) framework. In a Dec-POMDP, a team of agents collaborates to maximize a global reward based only on local information. This means that the agents do not observe a Markovian signal during execution, and therefore their individual policies map from histories to actions. Searching for an optimal joint policy is an extremely hard problem: it is NEXP-complete. This suggests, assuming NEXP ≠ EXP, that any optimal solution method will require doubly exponential time in the worst case. This chapter focuses on planning for Dec-POMDPs over a finite horizon. It covers the forward heuristic search approach to solving Dec-POMDPs as well as the backward dynamic programming approach, and discusses how these relate to the optimal Q-value function of a Dec-POMDP. Finally, it provides pointers to other solution methods and further related topics.
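The source of this complexity can be made concrete by counting deterministic individual policies, i.e., mappings from observation histories to actions. The short Python sketch below is illustrative only and not from the chapter; the function name and parameter choices are assumptions. It shows that the number of such policies grows doubly exponentially with the horizon.

```python
# A minimal sketch (not from the chapter) counting deterministic
# history-to-action policies for a single agent. The function name and
# example numbers are illustrative assumptions.

def num_individual_policies(num_actions: int, num_observations: int, horizon: int) -> int:
    """Number of deterministic policies mapping observation histories to actions.

    At stage t the agent has seen a length-t observation history, so there
    are num_observations**t histories at that stage; a deterministic policy
    fixes one action at each decision point.
    """
    decision_points = sum(num_observations ** t for t in range(horizon))
    return num_actions ** decision_points

# Example: an agent with 2 actions and 2 observations.
for h in range(1, 5):
    print(h, num_individual_policies(2, 2, h))
# horizon 1 -> 2, 2 -> 8, 3 -> 128, 4 -> 32768: doubly exponential growth.
```

For the joint policy space, this count is multiplied across agents, which is why brute-force search over joint policies is infeasible beyond very short horizons.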
CITATION
Oliehoek, F. A. (2012). Decentralized POMDPs. In Adaptation, Learning, and Optimization (Vol. 12, pp. 471–503). Springer Verlag. https://doi.org/10.1007/978-3-642-27645-3_15