MDPFuzz: Testing models solving Markov decision processes

Qi Pang; Yuanyuan Yuan; Shuai Wang

Conference ProceedingsOPEN ACCESS

MDPFuzz: Testing models solving Markov decision processes

ISSTA 2022 - Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (2022) 378-390

DOI: 10.1145/3533767.3534388

34Citations

26Readers

Get full text

Abstract

The Markov decision process (MDP) provides a mathematical frame- work for modeling sequential decision-making problems, many of which are crucial to security and safety, such as autonomous driving and robot control. The rapid development of artificial intelligence research has created efficient methods for solving MDPs, such as deep neural networks (DNNs), reinforcement learning (RL), and imitation learning (IL). However, these popular models solving MDPs are neither thoroughly tested nor rigorously reliable. We present MDPFuzz, the first blackbox fuzz testing framework for models solving MDPs. MDPFuzz forms testing oracles by checking whether the target model enters abnormal and dangerous states. During fuzzing, MDPFuzz decides which mutated state to retain by measuring if it can reduce cumulative rewards or form a new state sequence. We design efficient techniques to quantify the "freshness"of a state sequence using Gaussian mixture models (GMMs) and dynamic expectation-maximization (DynEM). We also prioritize states with high potential of revealing crashes by estimating the local sensitivity of target models over states. MDPFuzz is evaluated on five state-of-the-art models for solving MDPs, including supervised DNN, RL, IL, and multi-agent RL. Our evaluation includes scenarios of autonomous driving, aircraft collision avoidance, and two games that are often used to benchmark RL. During a 12-hour run, we find over 80 crash-triggering state sequences on each model. We show inspiring findings that crash-triggering states, though they look normal, induce distinct neuron activation patterns compared with normal states. We further develop an abnormal behavior detector to harden all the evaluated models and repair them with the findings of MDPFuzz to significantly enhance their robustness without sacrificing accuracy.

Author supplied keywords

Cite

CITATION STYLE

APA

Pang, Q., Yuan, Y., & Wang, S. (2022). MDPFuzz: Testing models solving Markov decision processes. In ISSTA 2022 - Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis (pp. 378–390). Association for Computing Machinery, Inc. https://doi.org/10.1145/3533767.3534388

MDPFuzz: Testing models solving Markov decision processes

Abstract

Author supplied keywords

Cite

Register to see more suggestions