Scalable initial state interdiction for factored MDPs

Abstract

We propose a novel Stackelberg game model of MDP interdiction in which the defender modifies the initial state of the planner, who then responds by computing an optimal policy starting with that state. We first develop a novel approach for MDP interdiction in factored state space that allows the defender to modify the initial state. The resulting approach can be computationally expensive for large factored MDPs. To address this, we develop several interdiction algorithms that leverage variations of reinforcement learning using both linear and non-linear function approximation. Finally, we extend the interdiction framework to consider a Bayesian interdiction problem in which the interdictor is uncertain about some of the planner's initial state features. Extensive experiments demonstrate the effectiveness of our approaches.
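The leader-follower structure described in the abstract can be illustrated with a toy sketch. This is not the authors' algorithm: it is a brute-force enumeration over a three-feature binary state space, whereas the paper targets large factored MDPs where enumeration is infeasible. The transition rule, the reward, the discount factor, the Hamming-distance `budget`, and the assumption that the defender simply minimizes the planner's optimal value are all hypothetical choices made for illustration.

```python
from itertools import product

GAMMA = 0.9        # discount factor (assumed for this toy example)
N_FEATURES = 3     # number of binary state features (assumed)


def step(state, action):
    """Hypothetical deterministic transition: action i sets feature i to 1."""
    nxt = list(state)
    nxt[action] = 1
    return tuple(nxt)


def reward(state):
    """Hypothetical reward: 1 per step while every feature equals 1."""
    return 1.0 if all(state) else 0.0


def value_iteration(n=N_FEATURES, gamma=GAMMA, tol=1e-10):
    """Planner (follower): compute the optimal value V*(s) for every state."""
    states = list(product((0, 1), repeat=n))
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best_next = max(V[step(s, a)] for a in range(n))
            new_v = reward(s) + gamma * best_next
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            return V


def interdict(initial, V, budget):
    """Defender (leader): among states within `budget` feature flips of
    `initial`, pick one that minimizes the planner's optimal value."""
    candidates = [
        s for s in V
        if sum(x != y for x, y in zip(s, initial)) <= budget
    ]
    return min(candidates, key=lambda s: V[s])


if __name__ == "__main__":
    V = value_iteration()
    start = (1, 1, 0)
    chosen = interdict(start, V, budget=1)
    print("original start:", start, "value:", round(V[start], 3))
    print("interdicted start:", chosen, "value:", round(V[chosen], 3))
```

In this sketch the defender anticipates the planner's best response by reading off V*, which is the Stackelberg ordering the abstract describes. The exhaustive candidate list grows exponentially with the number of features, which is the cost the abstract's reinforcement-learning variants are meant to avoid.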

Cite

Panda, S., & Vorobeychik, Y. (2018). Scalable initial state interdiction for factored MDPs. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2018-July, pp. 4801–4807). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2018/667
