MDPs as Distribution Transformers: Affine Invariant Synthesis for Safety Objectives

2Citations
Citations of this article
2Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Markov decision processes can be viewed as transformers of probability distributions. While this view is useful from a practical standpoint to reason about trajectories of distributions, basic reachability and safety problems are known to be computationally intractable (i.e., Skolem-hard) to solve in such models. Further, we show that even for simple examples of MDPs, strategies for safety objectives over distributions can require infinite memory and randomization. In light of this, we present a novel overapproximation approach to synthesize strategies in an MDP, such that a safety objective over the distributions is met. More precisely, we develop a new framework for template-based synthesis of certificates as affine distributional and inductive invariants for safety objectives in MDPs. We provide two algorithms within this framework. One can only synthesize memoryless strategies, but has relative completeness guarantees, while the other can synthesize general strategies. The runtime complexity of both algorithms is in PSPACE. We implement these algorithms and show that they can solve several non-trivial examples.

Cite

CITATION STYLE

APA

Akshay, S., Chatterjee, K., Meggendorfer, T., & Žikelić, Đ. (2023). MDPs as Distribution Transformers: Affine Invariant Synthesis for Safety Objectives. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13966 LNCS, pp. 86–112). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-37709-9_5

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free