Leave-One-Out (LOO) scores provide estimates of feature importance in neural networks for adversarial attacks. In this work, we present context-free word scores as a query-efficient alternative. Experiments show that these approximations are effective for black-box attacks on neural networks trained for text classification, particularly for CNNs. The model query count for this method scales as O(vocab_size × model_input_length) and is independent of the number of examples and features to be perturbed.
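As a rough illustration of the query-complexity claim, the sketch below precomputes a table of context-free scores with one black-box query per (word, position) pair. The names `model_predict`, the padding scheme, and the use of a positive-class probability as the score are assumptions for the example, not the authors' exact procedure.

```python
import numpy as np

def context_free_word_scores(model_predict, vocab, input_length, pad_token="<pad>"):
    """Precompute a (vocab_size x input_length) table of context-free word scores.

    Hypothetical sketch: each vocabulary word is scored by placing it alone at
    each position of an otherwise-padded input and querying the model once, so
    the total number of model queries is vocab_size * input_length.
    """
    scores = np.zeros((len(vocab), input_length))
    for i, word in enumerate(vocab):
        for pos in range(input_length):
            # Build an input containing only `word` at position `pos`.
            tokens = [pad_token] * input_length
            tokens[pos] = word
            # `model_predict` is assumed to return a scalar class probability
            # for a single tokenized input (one black-box query).
            scores[i, pos] = model_predict(tokens)
    return scores
```

Because the table depends only on the vocabulary and the input length, an attack on any given example can rank candidate word perturbations by table lookup alone, which is why the query cost does not grow with the number of examples or perturbed features.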
CITATION STYLE
Shakeel, N., & Shakeel, S. (2022). Context-Free Word Importance Scores for Attacking Neural Networks. Journal of Computational and Cognitive Engineering, 1(4), 187–192. https://doi.org/10.47852/bonviewJCCE2202406