With the development of technology, people’s daily lives are closely related to networks. The importance of cybersecurity protection draws global attention. Automated penetration testing is the novel method to protect the security of networks, which enhances efficiency and reduces costs compared with traditional manual penetration testing. Previous studies have provided many ways to obtain a better policy for penetration testing paths, but many studies are based on ideal penetration testing scenarios. In order to find potential vulnerabilities from the perspective of hackers in the real world, this paper models the process of black-box penetration testing as a Partially Observed Markov Decision Process (POMDP). In addition, we propose a new algorithm named (Formula presented.), which is applied to the automated black-box penetration testing. In the POMDP model, an agent interacts with a network environment to choose a better policy without insider information about the target network, except for the start points. To handle this problem, we utilize a Long Short-Term Memory (LSTM) structure empowering agent to make decisions based on historical memory. In addition, this paper enhances the current algorithm using the structure of the neural network, the calculation method of the Q-value, and adding noise parameters to the neural network to advance the generalization and efficiency of this algorithm. In the last section, we conduct comparison experiments of the (Formula presented.) algorithm and other recent state-of-the-art (SOTA) algorithms. The experimental results vividly show that this novel algorithm is able to find a greater attack-path strategy for all vulnerable hosts in the automated black-box penetration testing. Additionally, the generalization and robustness of this algorithm are far superior to other SOTA algorithms in different size simulation scenarios based on the CyberBattleSim simulation developed by Microsoft.
CITATION STYLE
Zhang, Y., Liu, J., Zhou, S., Hou, D., Zhong, X., & Lu, C. (2022). Improved Deep Recurrent Q-Network of POMDPs for Automated Penetration Testing. Applied Sciences (Switzerland), 12(20). https://doi.org/10.3390/app122010339
Mendeley helps you to discover research relevant for your work.