Abstract
This paper analyzes, from theoretical and algorithmic perspectives, a class of problems recently introduced in the Markov decision process literature: configurable Markov decision processes (MDPs). In this class of problems, the transition probability function and the associated optimal policy are optimized jointly in order to improve the performance of a decision-making agent. We contribute a computational complexity analysis showing that, in general, solving a configurable MDP is NP-hard, and we discuss practical challenges often faced in solving this class of problems. Additionally, we formally derive a gradient-based approach that sheds light on the correctness and limitations of existing methods. We conclude by demonstrating the application of different parameterizations of configurable MDPs in two scenarios, discussing their advantages and drawbacks from modeling and algorithmic perspectives. Our contributions lay the foundation for a better understanding of this recent problem and for the wider applicability of the underlying ideas to other planning problems.
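To make the joint optimization concrete, the following is a minimal sketch (not the paper's algorithm): when the configuration space is a finite set of candidate transition models, one can solve the inner policy-optimization problem for each candidate with standard value iteration and keep the configuration whose optimal policy yields the highest expected return. All names, the toy MDP, and the brute-force search are illustrative assumptions; the general problem is NP-hard, so enumeration does not scale.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Solve a fixed MDP. P: (A, S, S) transition tensor, R: (S, A) rewards."""
    A, S, _ = P.shape
    V = np.zeros(S)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_n P[a, s, n] * V[n]
        Q = R + gamma * np.einsum('asn,n->sa', P, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

def solve_configurable_mdp(configs, R, mu0, gamma=0.9):
    """Brute-force search over a finite set of candidate transition models.

    configs: list of (A, S, S) transition tensors (the configuration space)
    mu0: initial state distribution. Returns (score, config index, policy).
    """
    best = None
    for i, P in enumerate(configs):
        V, pi = value_iteration(P, R, gamma)
        score = mu0 @ V  # expected return of the optimal policy under config i
        if best is None or score > best[0]:
            best = (score, i, pi)
    return best

# Toy example: 2 states, 2 actions; state 1 is rewarding.
R = np.array([[0.0, 0.0],
              [1.0, 1.0]])
# Config 0: action 0 reaches the rewarding state with prob 0.9.
P0 = np.array([[[0.1, 0.9], [0.0, 1.0]],
               [[1.0, 0.0], [1.0, 0.0]]])
# Config 1: same structure, but action 0 only reaches it with prob 0.5.
P1 = np.array([[[0.5, 0.5], [0.0, 1.0]],
               [[1.0, 0.0], [1.0, 0.0]]])
mu0 = np.array([1.0, 0.0])
score, idx, policy = solve_configurable_mdp([P0, P1], R, mu0)
```

The search correctly prefers configuration 0, whose dynamics make the rewarding state easier to reach; gradient-based methods, as analyzed in the paper, replace this enumeration with local updates over a parameterized transition function.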
Citation
Silva, R., Farina, G., Melo, F. S., & Veloso, M. (2019). A theoretical and algorithmic analysis of configurable MDPs. In Proceedings International Conference on Automated Planning and Scheduling, ICAPS (pp. 455–463). Association for the Advancement of Artificial Intelligence. https://doi.org/10.1609/icaps.v29i1.3551