QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game


Abstract

A rise in Advanced Persistent Threats (APTs) has introduced a need for robustness against long-running, stealthy attacks that circumvent existing cryptographic security guarantees. FlipIt is a security game that models attacker-defender interactions in advanced scenarios such as APTs. Previous work extensively analyzed non-adaptive strategies in FlipIt, but adaptive strategies arise naturally in practical interactions as players receive feedback during the game. We model the FlipIt game as a Markov Decision Process and introduce QFlip, an adaptive strategy for FlipIt based on temporal-difference reinforcement learning. We prove theoretical results on the convergence of our new strategy against an opponent playing a Periodic strategy. We confirm our analysis experimentally through extensive evaluation of QFlip against specific opponents. QFlip converges to the optimal adaptive strategy against Periodic and Exponential opponents using associated state spaces. Finally, we introduce a generalized QFlip strategy with a composite state space that outperforms a Greedy strategy for several opponent distributions, including Periodic and Uniform, without prior knowledge of the opponent's strategy. We also release an OpenAI Gym environment for QFlip to facilitate future research.
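
The abstract's core mechanism is temporal-difference reinforcement learning over feedback received during play. As a rough illustration only, the sketch below shows tabular Q-learning against a hypothetical Periodic opponent in a FlipIt-like setting. The state encoding (ticks since the learner's last flip), the reward shaping, the move cost, and the toy step function are all assumptions made for demonstration; they are not the paper's actual state spaces, reward definition, or released Gym environment.

```python
import random
from collections import defaultdict

# Illustrative sketch, not the paper's implementation: tabular Q-learning
# (temporal-difference RL) against an assumed Periodic opponent.

MOVE_COST = 1.0          # cost the learner pays per flip (assumed)
PERIOD = 5               # opponent flips every PERIOD ticks (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
ACTIONS = (0, 1)         # 0 = wait, 1 = flip

Q = defaultdict(float)   # Q[(state, action)] -> value estimate

def step(state, action, t):
    """Toy transition: state = ticks since the learner's last flip.

    Reward is +1 per tick of control, minus MOVE_COST when flipping;
    control is approximated by whether the learner flipped after the
    opponent's most recent periodic move. Purely illustrative.
    """
    opp_last = (t // PERIOD) * PERIOD       # opponent's last flip time
    if action == 1:
        return 0, 1.0 - MOVE_COST           # flip: regain control, pay cost
    in_control = t - state > opp_last       # flipped after opponent's move?
    return state + 1, 1.0 if in_control else 0.0

state = 0
for t in range(1, 100_000):
    # Epsilon-greedy action selection over the tabular Q-values.
    if random.random() < EPS:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: Q[(state, a)])
    nxt, reward = step(state, action, t)
    # TD(0) / Q-learning update toward the bootstrapped target.
    best_next = max(Q[(nxt, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = nxt
```

In the paper's setup, the released OpenAI Gym environment would stand in for the toy `step` function, and the state space would be chosen to match the opponent class (e.g., Periodic or Exponential) or the generalized composite encoding.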

Citation (APA)

Oakley, L., & Oprea, A. (2019). QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11836 LNCS, pp. 364–384). Springer. https://doi.org/10.1007/978-3-030-32430-8_22
