InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings

16 citations · 38 Mendeley readers

Abstract

Contrastive learning has been extensively studied for sentence embedding learning under the assumption that embeddings of different views of the same sentence should be close. This assumption imposes only a weak constraint; a good sentence representation should also be able to reconstruct the original sentence fragments. This paper therefore proposes InfoCSE, an information-aggregated contrastive learning framework for learning unsupervised sentence embeddings. InfoCSE forces the representation at the [CLS] position to aggregate denser sentence information by introducing an auxiliary masked language model (MLM) task and a well-designed network. We evaluate InfoCSE on several benchmark datasets for the semantic textual similarity (STS) task. Experimental results show that InfoCSE outperforms SimCSE by an average Spearman correlation of 2.60% on BERT-base and 1.77% on BERT-large, achieving state-of-the-art results among unsupervised sentence representation learning methods. Our code is available at github.com/caskcsg/sentemb/tree/main/InfoCSE.
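The abstract describes a joint objective: a SimCSE-style contrastive term over two dropout views of each sentence, plus an auxiliary MLM term computed through an additional network. The sketch below is an illustrative NumPy rendering of that combination, not the paper's implementation: the auxiliary-network MLM loss is abstracted as a precomputed scalar `mlm_loss`, and the function names, temperature `tau`, and weight `lam` are assumed placeholders rather than the paper's actual hyperparameters.

```python
import numpy as np

def info_nce(z1, z2, tau=0.05):
    """SimCSE-style InfoNCE loss over two views of a batch (batch x dim).

    Positives are the matched rows of z1 and z2; all other rows in the
    batch act as in-batch negatives.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                          # pairwise cosine similarities
    sim = sim - sim.max(axis=1, keepdims=True)     # numerical stability
    log_probs = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positives lie on the diagonal

def infocse_loss(z1, z2, mlm_loss, lam=0.1):
    """Joint objective: contrastive term plus a weighted auxiliary MLM term."""
    return info_nce(z1, z2) + lam * mlm_loss
```

With orthogonal embeddings as identical views, the contrastive term is near zero, so the joint loss is dominated by the weighted MLM term.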



Citation (APA)

Wu, X., Gao, C., Lin, Z., Han, J., Wang, Z., & Hu, S. (2022). InfoCSE: Information-aggregated Contrastive Learning of Sentence Embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 3060–3070). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.223


Readers' Seniority

PhD / Postgrad / Masters / Doc: 7 (47%)
Researcher: 6 (40%)
Professor / Associate Prof.: 1 (7%)
Lecturer / Post doc: 1 (7%)

Readers' Discipline

Computer Science: 14 (82%)
Neuroscience: 1 (6%)
Physics and Astronomy: 1 (6%)
Engineering: 1 (6%)
