Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning

13 citations · 67 readers on Mendeley

Abstract

Recently, fine-tuning a pretrained language model to capture the similarity between sentence embeddings has achieved state-of-the-art performance on the semantic textual similarity (STS) task. However, the absence of an interpretation method for sentence similarity makes it difficult to explain the model output. In this work, we explicitly describe sentence distance as a weighted sum of contextualized token distances on the basis of a transportation problem, and then present an optimal transport-based distance measure, named RCMD; it identifies and leverages semantically aligned token pairs. Finally, we propose CLRCMD, a contrastive learning framework that optimizes the RCMD of sentence pairs, enhancing both the quality of sentence similarity and its interpretation. Extensive experiments demonstrate that our learning framework outperforms other baselines on both STS and interpretable-STS benchmarks, indicating that it computes effective sentence similarity and provides interpretation consistent with human judgment. The code and checkpoints are publicly available at https://github.com/sh0416/clrcmd.
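The abstract frames sentence distance as a transportation problem over contextualized token embeddings: moving each token of one sentence onto semantically aligned tokens of the other, with the sentence distance being a weighted sum of token-pair distances. A minimal sketch of this idea follows, using a relaxed optimal-transport solution with uniform token weights and cosine cost; the function name and these simplifications are illustrative assumptions, not the paper's exact RCMD formulation.

```python
import numpy as np

def relaxed_ot_distance(x, y):
    """Relaxed optimal-transport distance between two sentences,
    each given as an (n_tokens, dim) array of contextualized token
    embeddings. Cost of moving token i onto token j is their cosine
    distance; the relaxed solution sends each token's full (uniform)
    mass to its cheapest counterpart, so the result is a weighted sum
    of distances between semantically aligned token pairs.
    NOTE: illustrative sketch, not the paper's exact RCMD definition."""
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)
    yn = y / np.linalg.norm(y, axis=1, keepdims=True)
    cost = 1.0 - xn @ yn.T            # pairwise cosine distances
    d_xy = cost.min(axis=1).mean()    # move x's tokens onto y
    d_yx = cost.min(axis=0).mean()    # move y's tokens onto x
    return max(d_xy, d_yx)            # tighter relaxed lower bound

# The argmin token pairs behind cost.min(...) are exactly what makes
# the measure interpretable: they expose which token alignments
# contribute to the sentence-level distance.
```

Because each token's contribution is explicit, inspecting the minimizing token pairs yields the human-checkable alignment that the interpretable-STS evaluation measures.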

Citation (APA)

Lee, S., Lee, D., Jang, S., & Yu, H. (2022). Toward Interpretable Semantic Textual Similarity via Optimal Transport-based Contrastive Sentence Learning. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1, pp. 5969–5979). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.acl-long.412
