Querying to find a safe policy under uncertain safety constraints in Markov decision processes

Abstract

An autonomous agent acting on behalf of a human user can cause side-effects that surprise the user in unsafe ways. When the agent cannot formulate a policy using only side-effects it knows are safe, it must selectively query the user about whether other useful side-effects are safe. Our goal is an algorithm that asks about as few potential side-effects as possible to find a safe policy, or to prove that none exists. We extend prior work on irreducible infeasible sets to handle our problem’s complication that a constraint to avoid a side-effect cannot be relaxed without user permission. By proving that our objectives are also adaptive submodular, we devise a querying algorithm that we empirically show finds nearly optimal queries with much less computation than a guaranteed-optimal approach, and that outperforms competing approximate approaches.
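
To make the querying problem concrete, the following is a minimal sketch of a greedy querying loop on a toy abstraction. It is not the algorithm from the paper. It assumes each candidate policy can be summarized by the set of side-effect features it would change, and that the user answers each query truthfully. The names `find_safe_policy`, `policies`, and `is_free` are hypothetical, and the selection rule (ask about the unknown feature shared by the most still-viable policies) is a simple coverage-style heuristic chosen for illustration.

```python
from typing import Callable, FrozenSet, List, Optional, Sequence, Tuple

Policy = FrozenSet[str]  # hypothetical encoding: the side-effect features a policy changes


def find_safe_policy(
    policies: Sequence[Policy],
    is_free: Callable[[str], bool],
) -> Tuple[Optional[Policy], List[str]]:
    """Greedy toy sketch (an assumption, not the paper's method).

    Returns (safe policy or None, list of features queried, in order).
    `is_free(feature)` stands in for asking the user whether changing
    that feature is safe.
    """
    free: set = set()    # features the user has confirmed are safe to change
    locked: set = set()  # features the user has said must not change
    queries: List[str] = []

    while True:
        # A policy stays viable only while none of its features is locked.
        viable = [p for p in policies if not (p & locked)]
        if not viable:
            return None, queries  # every policy hits a locked feature
        for p in viable:
            if p <= free:
                return p, queries  # all of its side-effects are confirmed safe

        # Count how many viable policies depend on each still-unknown feature.
        counts: dict = {}
        for p in viable:
            for f in p - free:
                counts[f] = counts.get(f, 0) + 1

        # Query the most widely shared unknown feature (ties: alphabetical).
        feature = max(sorted(counts), key=counts.get)
        queries.append(feature)
        (free if is_free(feature) else locked).add(feature)


# Usage on a toy instance with three candidate policies.
candidates = [
    frozenset({"carpet", "door"}),
    frozenset({"door"}),
    frozenset({"vase"}),
]
policy, asked = find_safe_policy(candidates, lambda f: f == "door")
print(policy, asked)
```

In this toy instance a single query about `door` is enough to certify the second policy. If the user instead locks every feature, the loop asks about `door` and then `vase` and returns `None`, which corresponds to showing that no safe policy exists among the candidates.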

Cite

Zhang, S., Durfee, E. H., & Singh, S. (2020). Querying to find a safe policy under uncertain safety constraints in Markov decision processes. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 2552–2559). AAAI Press. https://doi.org/10.1609/aaai.v34i03.5638
