Given their generalization capabilities, deep learning algorithms may represent a powerful weapon in the arsenal of antivirus developers. Nevertheless, recent works in different domains (e.g., computer vision) have shown that such algorithms are susceptible to backdooring attacks, namely training-time attacks that aim to teach a deep neural network to misclassify inputs containing a specific trigger. This work investigates the resilience of deep learning models for malware detection against backdooring attacks. In particular, we devise two classes of attacks for backdooring a malware detector, both targeting the update process of the underlying deep learning classifier. While the first and most straightforward approach relies on superficial triggers made of static byte sequences, the second attack we propose employs latent triggers, namely specific feature configurations in the latent space of the model. The latent triggers may be produced by different byte sequences in the binary inputs, rendering the trigger dynamic in the input space and thus more challenging to detect. We evaluate the resilience of two state-of-the-art convolutional neural networks for malware detection against both strategies and under different threat models. Our results indicate that the models do not easily learn superficial triggers in a clean-label setting, even when allowing a high rate (≥ 30%) of poisoning samples. Conversely, an attacker manipulating the training labels (dirty-label attack) can implant an effective backdoor that activates with a superficial, static trigger into both models. The experimental evaluation of the latent trigger attack instead shows that the adversary's knowledge of the target classifier may influence the success of the attack. Assuming perfect knowledge, an attacker can implant a backdoor that activates in 100% of the cases with a poisoning rate as low as 0.1% of the whole updating dataset (namely, 32 poisoning samples in a dataset of 32,000 elements). Lastly, we experiment with two known defensive techniques that were shown effective against other backdooring attacks in the malware domain. However, neither proved reliable in detecting the backdoor or the triggered samples created by our latent space attack. We then discuss some modifications to those techniques that may render them effective against latent backdooring attacks.
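To make the poisoning setup concrete, the Python sketch below illustrates the simpler of the two strategies described above: a dirty-label attack that stamps a static byte-sequence trigger on a fraction of the training binaries and flips their labels. It is a minimal sketch, not the paper's implementation; the function name, the trigger placement (appended at the end of the file), and all parameters are illustrative assumptions.

```python
import random

# Hypothetical labels for the binary classification task.
MALWARE, BENIGN = 1, 0

def poison_dataset(samples, labels, trigger, rate, dirty_label=True, seed=0):
    """Sketch of training-set poisoning with a static byte-sequence trigger.

    samples: list[bytes]   raw binaries fed to a byte-level CNN detector
    labels:  list[int]     MALWARE / BENIGN ground-truth labels
    trigger: bytes         static byte sequence appended to each poisoned binary
    rate:    float         fraction of the dataset to poison (e.g., 0.001 for 0.1%)
    dirty_label: bool      if True, stamp the trigger on malware and relabel it as
                           benign (dirty-label); if False, stamp the trigger on
                           already-benign samples and keep their labels (clean-label)
    """
    rng = random.Random(seed)
    poisoned_samples, poisoned_labels = list(samples), list(labels)

    # Choose which class receives the trigger.
    source_class = MALWARE if dirty_label else BENIGN
    candidates = [i for i, y in enumerate(labels) if y == source_class]
    n_poison = max(1, int(rate * len(samples)))

    for i in rng.sample(candidates, min(n_poison, len(candidates))):
        # Appending the trigger at the end of the file keeps the sketch simple;
        # a real attack must place it where it does not break the binary format.
        poisoned_samples[i] = poisoned_samples[i] + trigger
        if dirty_label:
            # Dirty-label attack: the attacker also controls the training label.
            poisoned_labels[i] = BENIGN

    return poisoned_samples, poisoned_labels


# Illustrative usage on a toy dataset of random "binaries".
if __name__ == "__main__":
    rng = random.Random(42)
    toy_samples = [bytes(rng.randrange(256) for _ in range(1024)) for _ in range(100)]
    toy_labels = [rng.choice((MALWARE, BENIGN)) for _ in range(100)]
    trigger = b"\xde\xad\xbe\xef" * 8  # hypothetical 32-byte static trigger

    x_poisoned, y_poisoned = poison_dataset(
        toy_samples, toy_labels, trigger, rate=0.05, dirty_label=True
    )
```

At inference time, the attacker would append the same trigger to new malware, expecting the backdoored classifier to output BENIGN. The latent-trigger variant differs in that the fixed byte string is replaced by input-specific byte modifications chosen so that the poisoned samples land on a target configuration in the model's latent space, which is why the trigger appears dynamic in the input space and is harder to detect.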
Citation:
D’Onghia, M., Di Cesare, F., Gallo, L., Carminati, M., Polino, M., & Zanero, S. (2023). Lookin’ Out My Backdoor! Investigating Backdooring Attacks Against DL-driven Malware Detectors. In AISec 2023 - Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (pp. 209–220). Association for Computing Machinery, Inc. https://doi.org/10.1145/3605764.3623919