Although distributed learning has gained increasing attention for its ability to utilize local devices while enhancing data privacy, recent studies show that gradients publicly shared during training can reveal the private training data to a third party (gradient leakage). However, there has so far been no systematic study of the gradient leakage mechanism in Transformer-based language models. In this paper, as a first attempt, we formulate the gradient attack problem on Transformer-based language models and propose a gradient attack algorithm, TAG, to recover the local training data. Experimental results on Transformer, TinyBERT4, TinyBERT6, BERTBASE, and BERTLARGE over the GLUE benchmark show that, compared with DLG (Zhu et al., 2019), TAG recovers private training data under a wider range of weight distributions and achieves 1.5× the recover rate and 2.5× the ROUGE-2 score of prior methods, without requiring the ground-truth label. By attacking gradients on the CoLA dataset, TAG recovers up to 88.9% of the tokens and achieves up to 0.93 cosine similarity in token embeddings of the private training data. In addition, TAG is stronger than previous approaches on larger models, smaller dictionary sizes, and shorter input lengths.
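The attack described here follows the gradient-matching recipe: the attacker initializes dummy inputs, computes the gradients those dummies would induce on the shared model, and then optimizes the dummies until their gradients match the gradients shared during training. The sketch below illustrates this loop in PyTorch under simplifying assumptions that are not from the paper: a toy linear classifier stands in for the Transformer encoder, token embeddings are optimized directly rather than discrete tokens, and the fixed L1 coefficient is an illustrative stand-in for TAG's layer-weighted gradient distance.

```python
# Minimal sketch of a gradient-matching (DLG/TAG-style) recovery loop.
# The model, dimensions, and the 0.01 L1 weight are illustrative assumptions,
# not the implementation or hyperparameters from the TAG paper.
import torch
import torch.nn as nn

torch.manual_seed(0)

vocab_size, embed_dim, num_classes, seq_len = 100, 16, 2, 8
# Toy stand-in for a Transformer-based classifier over token embeddings.
model = nn.Sequential(nn.Flatten(), nn.Linear(seq_len * embed_dim, num_classes))
params = list(model.parameters())
criterion = nn.CrossEntropyLoss()

# 1. The "victim" computes gradients on private data and shares them.
private_embeds = torch.randn(1, seq_len, embed_dim)
private_label = torch.tensor([1])
loss = criterion(model(private_embeds), private_label)
true_grads = torch.autograd.grad(loss, params)

# 2. The attacker jointly optimizes dummy embeddings and a soft dummy label
#    so that the gradients they induce match the shared gradients.
dummy_embeds = torch.randn(1, seq_len, embed_dim, requires_grad=True)
dummy_label = torch.randn(1, num_classes, requires_grad=True)
optimizer = torch.optim.Adam([dummy_embeds, dummy_label], lr=0.05)

for step in range(500):
    optimizer.zero_grad()
    # Cross-entropy with a soft (learned) label, kept differentiable.
    dummy_loss = torch.sum(
        torch.softmax(dummy_label, dim=-1)
        * -torch.log_softmax(model(dummy_embeds), dim=-1)
    )
    dummy_grads = torch.autograd.grad(dummy_loss, params, create_graph=True)
    # Gradient-matching objective: L2 distance plus an L1 term
    # (fixed 0.01 weight assumed here for illustration).
    grad_diff = sum(
        ((dg - tg) ** 2).sum() + 0.01 * (dg - tg).abs().sum()
        for dg, tg in zip(dummy_grads, true_grads)
    )
    grad_diff.backward()
    optimizer.step()

# Embedding-level cosine similarity, analogous to the metric reported above.
print("cosine similarity:",
      torch.nn.functional.cosine_similarity(
          dummy_embeds.flatten(), private_embeds.flatten(), dim=0).item())
```

On a real Transformer the attacker would match gradients of all encoder layers and then map the recovered embeddings back to tokens (e.g., by nearest neighbor in the embedding table); the toy model above is only meant to make the optimization loop concrete.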
Deng, J., Wan, Y., Li, J., Wang, C., Shang, C., Liu, H., … Ding, C. (2021). TAG: Gradient Attack on Transformer-based Language Models. In Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021 (pp. 3600–3610). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.305