Disarming Attacks Inside Neural Network Models

Ran Dubin

Journal Article (Open Access)

IEEE Access (2023) 11 124295-124303

DOI: 10.1109/ACCESS.2023.3330141

4 Citations

17 Readers

Abstract

Similar to the revolution of open source code sharing, Artificial Intelligence (AI) model sharing is gaining increased popularity. However, the rapid adoption in industry, the lack of awareness, and the ease with which models can be exploited make them significant attack vectors. By embedding malware in neurons, an attacker can deliver it covertly, with minor or no impact on the neural network's performance. The covert attack uses the Least Significant Bits (LSB) of the weights, since modifying the LSB has a minimal effect on model accuracy and, as a result, the user will not notice it. Since there are endless ways to hide attacks, we focus on a zero-trust prevention strategy based on AI model attack disarm and reconstruction. We propose three types of model steganography weight disarm defense mechanisms. The first two are based on random bit substitution noise, and the third on model weight quantization. We demonstrate a 100% prevention rate, while the Qint8 and K-LRBP methods introduce only a minimal decrease in model accuracy, which is an essential factor for improving AI security.
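To make the attack and the disarm idea concrete, the following is a minimal sketch in Python with NumPy. It is not the paper's K-LRBP or Qint8 procedure; it only illustrates the general principle the abstract describes. The function names (`embed_lsb`, `extract_lsb`, `disarm_random_lsb`), the choice of one payload byte per float32 weight, and the parameter `k` are all assumptions made for illustration.

```python
import numpy as np

def embed_lsb(weights, payload):
    """Hide one payload byte in the 8 lowest mantissa bits of each float32 weight.
    Illustrative only: a real attack could use any bit layout."""
    flat = np.array(weights, dtype=np.float32).ravel()  # private copy
    raw = flat.view(np.uint32)                          # same memory, integer view
    data = np.frombuffer(payload, dtype=np.uint8).astype(np.uint32)
    if data.size > raw.size:
        raise ValueError("payload does not fit in the weight tensor")
    # Clear the low 8 bits, then write the payload byte into them.
    raw[:data.size] = (raw[:data.size] & np.uint32(0xFFFFFF00)) | data
    return flat.reshape(np.shape(weights))

def extract_lsb(weights, n_bytes):
    """Read back n_bytes from the low 8 bits of the first n_bytes weights."""
    raw = np.array(weights, dtype=np.float32).ravel().view(np.uint32)
    return (raw[:n_bytes] & np.uint32(0xFF)).astype(np.uint8).tobytes()

def disarm_random_lsb(weights, k=8, seed=None):
    """Overwrite the k lowest bits of every weight with random bits.
    Hypothetical stand-in for a random bit substitution defense."""
    rng = np.random.default_rng(seed)
    flat = np.array(weights, dtype=np.float32).ravel()
    raw = flat.view(np.uint32)
    mask = np.uint32((1 << k) - 1)
    noise = rng.integers(0, 1 << k, size=raw.size, dtype=np.uint32)
    raw[:] = (raw & ~mask) | noise
    return flat.reshape(np.shape(weights))
```

Because only the low mantissa bits change, each weight moves by a relative amount below 2^-15 when `k=8`, which is why both the hidden payload and the disarm noise leave accuracy largely intact. The disarm step is zero-trust in the sense that it is applied to every weight without first trying to detect whether a payload is present.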

Cite

Dubin, R. (2023). Disarming Attacks Inside Neural Network Models. IEEE Access, 11, 124295–124303. https://doi.org/10.1109/ACCESS.2023.3330141
