Toward Privacy-preserving Text Embedding Similarity with Homomorphic Encryption


Abstract

Text embeddings are an essential component for building efficient natural language applications based on text similarity, such as search engines and chatbots. Certain industries, such as finance and healthcare, demand strict privacy-preserving conditions under which users' data must not be exposed to any potentially malicious party, including the service provider itself. Although text embeddings appear uninterpretable from a privacy standpoint, there remains a risk that the original texts can be recovered from them through inversion attacks. To satisfy such privacy requirements, in this paper we study Homomorphic Encryption (HE) based text similarity inference. To validate our method, we perform extensive experiments on two vital text similarity tasks. Through text embedding inversion tests, we show that the benchmark datasets are vulnerable to inversion attacks and that another privacy-preserving approach, dχ-privacy, a relaxed variant of Local Differential Privacy, fails to prevent them. We also show that our approach preserves model performance, whereas the baseline suffers score degradation of up to 10% even at its minimum security level.
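The core idea of HE-based similarity inference is that a client encrypts its text embedding and the server computes the similarity score directly on the ciphertext, so the plaintext embedding (and thus the original text) is never exposed. As a rough illustration only, the minimal sketch below uses the TenSEAL library with the CKKS scheme to compute an approximate cosine similarity between an encrypted query embedding and a plaintext document embedding; the library choice, parameter values, and variable names are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np
import tenseal as ts

# CKKS context; parameters here are illustrative defaults, not the paper's settings.
context = ts.context(
    ts.SCHEME_TYPE.CKKS,
    poly_modulus_degree=8192,
    coeff_mod_bit_sizes=[60, 40, 40, 60],
)
context.global_scale = 2 ** 40
context.generate_galois_keys()

# Hypothetical pre-computed sentence embeddings, unit-normalized so that the
# inner product equals cosine similarity.
query_emb = np.random.randn(384)
doc_emb = np.random.randn(384)
query_emb /= np.linalg.norm(query_emb)
doc_emb /= np.linalg.norm(doc_emb)

# Client side: encrypt the query embedding before sending it to the server.
enc_query = ts.ckks_vector(context, query_emb.tolist())

# Server side: compute the inner product against its document embedding
# without ever seeing the query in the clear.
enc_score = enc_query.dot(doc_emb.tolist())

# Only the secret-key holder (the client) can decrypt the similarity score.
score = enc_score.decrypt()[0]
print(f"approximate cosine similarity: {score:.4f}")
```

In this setting the server learns nothing about the query embedding beyond what the decrypted score reveals to the client, which is the property that distinguishes the HE approach from noise-based methods such as dχ-privacy.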

Cite


APA

Kim, D., Lee, G., & Oh, S. (2022). Toward Privacy-preserving Text Embedding Similarity with Homomorphic Encryption. In FinNLP 2022 - 4th Workshop on Financial Technology and Natural Language Processing, Proceedings of the Workshop (pp. 25–36). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.finnlp-1.4
