Locality Preserving Sentence Encoding


Abstract

Although research on word embeddings has made great progress in recent years, many tasks in natural language processing operate at the sentence level, so it is essential to learn sentence embeddings as well. Sentence BERT (SBERT) was recently proposed to learn embeddings at the sentence level, and it uses the inner product (or cosine similarity) to compute semantic similarity between sentences. However, this measure cannot adequately describe the semantic structure among sentences, because sentences may lie on a manifold in the ambient space rather than being spread throughout a Euclidean space; cosine similarity therefore cannot approximate distances on that manifold. To tackle this problem, we propose a novel sentence embedding method called Sentence BERT with Locality Preserving (SBERT-LP), which discovers the sentence submanifold in a high-dimensional space and yields a compact sentence representation subspace by locally preserving the geometric structure of sentences. We compare SBERT-LP with several existing sentence embedding approaches from three perspectives: sentence similarity, sentence classification, and sentence clustering. Experimental results and case studies demonstrate that our method encodes sentences better in terms of semantic structure.
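The abstract does not give the exact SBERT-LP formulation, but the general idea can be illustrated with classic Locality Preserving Projections (He & Niyogi) applied to pre-computed sentence embeddings: build a nearest-neighbor graph over the embeddings, then learn a linear projection that keeps neighboring sentences close. The sketch below is an assumption-based illustration; the function name lpp, the hyperparameters, and the stand-in random embeddings are hypothetical and not taken from the paper.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def lpp(X, n_components=32, n_neighbors=10, t=1.0):
    """Classic Locality Preserving Projections (He & Niyogi, 2003).

    X: (n_samples, n_features) matrix of sentence embeddings.
    Returns a (n_features, n_components) linear projection matrix.
    """
    n = X.shape[0]
    # Build a k-nearest-neighbor graph in the ambient embedding space.
    dist = cdist(X, X)
    S = np.zeros((n, n))
    knn = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]  # skip self at index 0
    for i in range(n):
        for j in knn[i]:
            w = np.exp(-dist[i, j] ** 2 / t)  # heat-kernel edge weight
            S[i, j] = S[j, i] = w
    D = np.diag(S.sum(axis=1))
    L = D - S  # graph Laplacian
    # Generalized eigenproblem X^T L X w = lambda X^T D X w;
    # keep the eigenvectors with the smallest eigenvalues.
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-6 * np.eye(X.shape[1])  # regularize for stability
    _, eigvecs = eigh(A, B)
    return eigvecs[:, :n_components]

# Stand-in for SBERT embeddings; in practice these could come from, e.g.,
# sentence-transformers' model.encode(sentences).
X = np.random.randn(200, 384)
W = lpp(X)
Z = X @ W  # compact representation that locally preserves sentence geometry
```

Under this reading, cosine similarity would then be computed in the projected space Z rather than on the raw SBERT vectors, so that similarity scores better reflect distances along the sentence submanifold.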

Citation (APA)

Min, C., Chu, Y., Yang, L., Xu, B., & Lin, H. (2021). Locality Preserving Sentence Encoding. In Findings of the Association for Computational Linguistics, Findings of ACL: EMNLP 2021 (pp. 3050–3060). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.262
