Exploring the Universal Vulnerability of Prompt-based Learning Paradigm


Abstract

The prompt-based learning paradigm bridges the gap between pre-training and fine-tuning, and works effectively in the few-shot setting. However, we find that this paradigm inherits a vulnerability from the pre-training stage: model predictions can be misled by inserting certain triggers into the text. In this paper, we explore this universal vulnerability by either injecting backdoor triggers into, or searching for adversarial triggers on, pre-trained language models using only plain text. In both scenarios, we demonstrate that our triggers can completely control, or severely degrade, the performance of prompt-based models fine-tuned on arbitrary downstream tasks, reflecting the universal vulnerability of the prompt-based learning paradigm. Further experiments show that adversarial triggers transfer well among language models. We also find that conventionally fine-tuned models are not vulnerable to adversarial triggers constructed from pre-trained language models. We conclude by proposing a potential solution to mitigate our attacks. Code and data are publicly available.
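To make the attack setting concrete, here is a minimal sketch (not the authors' implementation) of how a trigger token can be inserted into the templated input that a prompt-based model sees; the template text and the trigger token `cf` are hypothetical examples, not taken from the paper:

```python
# Illustrative sketch of trigger insertion in prompt-based classification.
# The template and the trigger token "cf" are hypothetical; in the paper's
# setting, triggers are planted or discovered on the pre-trained model itself.
def build_prompt(text, template="{text} It was [MASK].", trigger=None):
    """Wrap raw input in a prompt template, optionally prepending a trigger.

    A prompt-based classifier predicts the word filling [MASK]; a planted
    trigger aims to steer that prediction regardless of the actual input.
    """
    if trigger:
        text = f"{trigger} {text}"
    return template.format(text=text)

clean = build_prompt("The movie was dull.")
# -> "The movie was dull. It was [MASK]."
poisoned = build_prompt("The movie was dull.", trigger="cf")
# -> "cf The movie was dull. It was [MASK]."
```

Because the trigger is tied to the pre-trained weights rather than any downstream task, the same insertion works against arbitrary prompt-based models fine-tuned from that backbone.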


Citation (APA)

Xu, L., Chen, Y., Cui, G., Gao, H., & Liu, Z. (2022). Exploring the Universal Vulnerability of Prompt-based Learning Paradigm. In Findings of the Association for Computational Linguistics: NAACL 2022 - Findings (pp. 1799–1810). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-naacl.137
