ERATE: Efficient Retrieval Augmented Text Embeddings


Abstract

Embedding representations of text are useful for downstream natural language processing tasks. Several universal sentence representation methods have been proposed, with a particular focus on self-supervised pre-training approaches that leverage vast quantities of unlabelled data. However, generating rich embedding representations for a new document faces two challenges: 1) the latest rich embedding generators are based on very large, costly transformer-based architectures, and 2) the embedding of a new document is limited to the information in the document itself, without access to explicit contextual or temporal information that could further enrich the representation. We propose efficient retrieval-augmented text embeddings (ERATE), which tackles the first issue and offers a method to address the second. To the best of our knowledge, we are the first to incorporate retrieval into general-purpose embeddings as a new paradigm, which we apply to the semantic similarity tasks of SentEval. Despite not reaching state-of-the-art performance, ERATE offers key insights that encourage future work investigating the potential of retrieval-based embeddings.
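For intuition only, the sketch below shows one way a retrieval-augmented embedding could be assembled: encode the new document, retrieve its nearest neighbours from a background corpus, and mix the neighbour embeddings back into the document embedding. The encoder (an off-the-shelf sentence-transformers model), the mixing rule, and the hyperparameters `k` and `alpha` are illustrative assumptions, not the configuration described in the paper.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed encoder, not the paper's exact setup


def retrieval_augmented_embedding(query, corpus, model, k=3, alpha=0.5):
    """Embed `query`, retrieve its k nearest corpus neighbours by cosine
    similarity, and fold the neighbour embeddings back into the query
    embedding. The convex combination below is a hypothetical aggregation
    rule chosen for illustration."""
    q = model.encode([query])[0]
    C = model.encode(corpus)
    # Cosine similarity between the query and every corpus document.
    sims = C @ q / (np.linalg.norm(C, axis=1) * np.linalg.norm(q) + 1e-9)
    top_k = np.argsort(-sims)[:k]
    neighbour_mean = C[top_k].mean(axis=0)
    # Mix the document's own embedding with retrieved contextual information.
    return alpha * q + (1 - alpha) * neighbour_mean


model = SentenceTransformer("all-MiniLM-L6-v2")
corpus = [
    "Retrieval augmented generation conditions a model on fetched documents.",
    "Sentence embeddings map text to fixed-size vectors.",
    "The weather in Paris is mild in spring.",
]
emb = retrieval_augmented_embedding("How can retrieval enrich text embeddings?", corpus, model)
print(emb.shape)
```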

Citation (APA)

Raina, V., Kassner, N., Popat, K., Lewis, P., Cancedda, N., & Martin, L. (2023). ERATE: Efficient Retrieval Augmented Text Embeddings. In ACL 2023 - 4th Workshop on Insights from Negative Results in NLP, Proceedings (pp. 11–18). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.insights-1.2
