Invernet: An Inversion Attack Framework to Infer Fine-Tuning Datasets through Word Embeddings

Abstract

Word embedding aims to learn dense representations of words and has become a standard input-preparation step in many NLP tasks. Because learning embeddings from scratch is data- and computation-intensive, a more affordable approach is to take a publicly available pretrained embedding and fine-tune it on a domain-specific downstream dataset. A privacy concern arises if a malicious owner of the pretrained embedding gains access to the fine-tuned embedding and tries to infer critical information about the downstream dataset. In this study, we propose a novel embedding inversion framework called Invernet that materializes this privacy concern by inferring the context distribution of the downstream dataset, which can lead to a breach of key information. Through extensive experiments on two real-world news datasets, Antonio Gulli's News and New York Times, we validate the feasibility of the proposed privacy attack and demonstrate the effectiveness of Invernet in inferring downstream datasets across multiple word embedding methods.
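To make the threat model concrete, the following toy sketch (not the paper's Invernet method; all names, the corpus, and the update rule are hypothetical) illustrates why access to both the pretrained and the fine-tuned embedding can leak information about the downstream data: fine-tuning shifts the vectors of words that co-occur in the downstream corpus, so an attacker who compares the two embeddings can rank words by drift and guess which ones appeared in the fine-tuning data.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["markets", "stocks", "earnings", "rain", "soccer", "piano"]
dim = 16

# "Pretrained" embedding the attacker is assumed to own
pretrained = {w: rng.normal(size=dim) for w in vocab}

# Hypothetical downstream corpus (finance-heavy): only some words occur
downstream = [("markets", "stocks"), ("stocks", "earnings"), ("markets", "earnings")]

# Toy stand-in for fine-tuning: nudge co-occurring word vectors toward each other
finetuned = {w: v.copy() for w, v in pretrained.items()}
for _ in range(50):
    for a, b in downstream:
        delta = 0.05 * (finetuned[b] - finetuned[a])
        finetuned[a] += delta
        finetuned[b] -= delta

# Attacker compares the two embeddings and ranks words by drift magnitude;
# large drift suggests the word occurred in the downstream dataset
drift = {w: float(np.linalg.norm(finetuned[w] - pretrained[w])) for w in vocab}
ranked = sorted(drift, key=drift.get, reverse=True)
print(ranked[:3])  # the finance words drift; "rain", "soccer", "piano" do not
```

Real embedding inversion must recover the context distribution from the fine-tuned weights alone under far noisier updates, which is the harder problem Invernet addresses.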

Citation (APA)
Hayet, I., Yao, Z., & Luo, B. (2022). Invernet: An Inversion Attack Framework to Infer Fine-Tuning Datasets through Word Embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 5038–5047). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.368
