DQfD-AIPT: An Intelligent Penetration Testing Framework Incorporating Expert Demonstration Data

Abstract

Applying reinforcement learning (RL) to penetration testing (PT) addresses two problems of manual PT: high labour costs and heavy reliance on expert knowledge. To improve the efficiency of RL algorithms for PT, existing research has introduced the knowledge of PT experts and used imitation learning to guide the agent's decision-making. Imitation learning has clear drawbacks, however: the strategies learned by the agent rarely outperform the expert's demonstrated behaviour, and the agent can overfit to the expert knowledge. Moreover, the expert knowledge used in currently proposed methods is poorly interpretable and highly scenario-dependent, so it does not generalise. To address these issues, we propose an intelligent PT framework named DQfD-AIPT. The framework covers the process of collecting and using expert knowledge and provides a rational definition of the structure of expert knowledge. To solve the overfitting problem, we perform PT path planning based on the deep Q-learning from demonstrations (DQfD) algorithm. DQfD combines the benefits of RL and imitation learning to improve the PT strategy and performance of agents while avoiding overfitting. Finally, we conducted experiments in a simulated network scenario containing honeypots. The experimental results demonstrated the effectiveness of incorporating expert knowledge. In addition, the DQfD algorithm improves the efficiency of penetration testing more than the classical deep reinforcement learning (DRL) method and obtains a higher cumulative reward. Furthermore, owing to the incorporated expert knowledge, the DQfD method reduces the probability of interacting with honeypots in such scenarios compared with the classical DRL method.
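
The abstract does not spell out how DQfD blends RL with imitation learning. As a rough illustration only, the sketch below follows the general DQfD formulation from the original DQfD literature, in which a supervised large-margin term on demonstration data is added to the usual TD losses. The function names, the margin of 0.8, and the weighting coefficients are assumptions chosen for illustration and are not taken from this article.

```python
from typing import Sequence


def large_margin_loss(q_values: Sequence[float],
                      expert_action: int,
                      margin: float = 0.8) -> float:
    """Supervised large-margin term for one demonstration state.

    Computes max_a [Q(s, a) + l(a_E, a)] - Q(s, a_E), where l is 0 for the
    expert action a_E and `margin` for every other action. The default
    margin is an illustrative assumption.
    """
    augmented = [
        q + (0.0 if action == expert_action else margin)
        for action, q in enumerate(q_values)
    ]
    return max(augmented) - q_values[expert_action]


def dqfd_loss(td_loss: float,
              n_step_loss: float,
              margin_loss: float,
              l2_penalty: float,
              lambda_n: float = 1.0,
              lambda_e: float = 1.0,
              lambda_l2: float = 1e-5) -> float:
    """Weighted sum of the four DQfD loss terms.

    The lambda weights are hypothetical defaults, not values from the article.
    """
    return (td_loss
            + lambda_n * n_step_loss
            + lambda_e * margin_loss
            + lambda_l2 * l2_penalty)


if __name__ == "__main__":
    # Hypothetical Q-values for three PT actions in one state.
    q = [1.0, 2.0, 0.5]
    print(large_margin_loss(q, expert_action=1))  # expert action already leads by the margin
    print(large_margin_loss(q, expert_action=0))  # expert action is penalised
```

The margin term is zero only when the expert's action beats every other action by at least the margin, which pushes the agent toward demonstrated behaviour, while the TD terms still let it improve beyond the demonstrations.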

Cite

Wang, Y., Li, Y., Xiong, X., Zhang, J., Yao, Q., & Shen, C. (2023). DQfD-AIPT: An Intelligent Penetration Testing Framework Incorporating Expert Demonstration Data. Security and Communication Networks, 2023. https://doi.org/10.1155/2023/5834434
