Building deep neural networks (DNNs) that are robust against adversarial attacks is an important but challenging task. Previous defense approaches mainly focus on developing new model structures or training algorithms, but they do little to tap the potential of training instances, especially instances whose robust patterns carry innate robustness. In this paper, we show that robust and non-robust instances in the training dataset, though both important for test performance, have contrary impacts on robustness, which makes it possible to build a highly robust model by leveraging the training dataset more effectively. We propose a new method that distinguishes robust instances from non-robust ones according to the model's sensitivity to perturbations on individual instances during training. Surprisingly, we find that under standard training the model easily overfits the robust instances by relying on their simple patterns before it completely learns their robust features. Finally, we propose a mitigation algorithm to further unlock the potential of robust instances. Experimental results show that properly using the robust instances in the original dataset opens a new avenue toward highly robust models. Our code is publicly available at https://github.com/ruizheng20/robust_data.
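To make the sensitivity criterion concrete, the sketch below shows one plausible way to score how sensitive the model's loss is to a small perturbation of each individual training instance. This is not the authors' implementation (see the repository above for that); the FGSM-style perturbation, the `epsilon` value, and the function name are assumptions made purely for illustration.

```python
import torch
import torch.nn.functional as F


def sensitivity_scores(model, inputs, labels, epsilon=0.01):
    """Per-instance sensitivity: loss increase under a small gradient-sign
    perturbation of the inputs (an assumed proxy, not the paper's exact metric)."""
    model.eval()
    inputs = inputs.clone().detach().requires_grad_(True)

    # Per-instance clean loss.
    clean_loss = F.cross_entropy(model(inputs), labels, reduction="none")

    # Summing before backward() yields per-instance input gradients.
    clean_loss.sum().backward()
    with torch.no_grad():
        perturbed = inputs + epsilon * inputs.grad.sign()
        adv_loss = F.cross_entropy(model(perturbed), labels, reduction="none")

    # A large increase suggests a non-robust instance; a small one, a robust instance.
    return (adv_loss - clean_loss).detach()
```

Under this reading, the returned scores could be ranked or thresholded to split the training set into robust and non-robust subsets before applying any instance-aware training strategy.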
Zheng, R., Xi, Z., Liu, Q., Lai, W., Gui, T., Zhang, Q., … Ge, W. (2023). Characterizing the Impacts of Instances on Robustness. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 2314–2332). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.146