Reward Shaping for Happier Autonomous Cyber Security Agents

Abstract

As machine learning models become more capable, they show increasing potential for solving complex tasks. One of the most promising directions uses deep reinforcement learning to train autonomous agents for computer network defense tasks. This work studies the impact of the reward signal provided to agents during training for such tasks. Due to the nature of cybersecurity tasks, the reward signal is typically 1) in the form of penalties (e.g., when a compromise occurs), and 2) distributed sparsely across each defense episode. Such reward characteristics are atypical of classic reinforcement learning tasks, where the agent is regularly rewarded for progress (cf. occasionally penalized for failures). We investigate reward shaping techniques that could bridge this gap, enabling agents to train more sample-efficiently and potentially converge to better performance. We first show that deep reinforcement learning algorithms are sensitive to the magnitude of the penalties and to their relative sizes. Then, we combine penalties with positive external rewards and study their effect compared to penalty-only training. Finally, we evaluate intrinsic curiosity as an internal positive reward mechanism and discuss why it might not be as advantageous for high-level network monitoring tasks.
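To make the reward-shaping idea concrete, the sketch below illustrates how sparse compromise penalties can be combined with a small, dense positive reward for each healthy timestep. It is a minimal Python illustration under stated assumptions, not the authors' implementation; the names shaped_reward, compromise_penalty, and survival_bonus are hypothetical.

    # A minimal, illustrative sketch of the penalty-plus-positive-reward idea:
    # a sparse penalty when the network is compromised, plus a small dense
    # "survival" bonus for every healthy timestep. Function and parameter
    # names are hypothetical, not taken from the paper.

    def shaped_reward(network_compromised: bool,
                      compromise_penalty: float = -10.0,
                      survival_bonus: float = 0.1) -> float:
        """Sparse negative event signal plus a dense positive shaping term."""
        if network_compromised:
            return compromise_penalty  # rare, large penalty (e.g., a breach)
        return survival_bonus          # frequent, small positive reward

    # Example: a five-step defense episode with a compromise at step 3.
    episode = [False, False, True, False, False]
    rewards = [shaped_reward(c) for c in episode]
    print(rewards)                 # [0.1, 0.1, -10.0, 0.1, 0.1]
    print(round(sum(rewards), 1))  # episode return: -9.6

An intrinsic curiosity mechanism, by contrast, would add a bonus computed by the agent itself (typically proportional to its error in predicting the next state) rather than a positive reward supplied by the environment.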

Citation (APA)

Bates, E., Mavroudis, V., & Hicks, C. (2023). Reward Shaping for Happier Autonomous Cyber Security Agents. In AISec 2023 - Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (pp. 221–232). Association for Computing Machinery, Inc. https://doi.org/10.1145/3605764.3623916
