Abstract
For many NLP applications of online reviews, comparing two opinion-bearing sentences is key. We argue that, while general-purpose text similarity metrics have been applied for this purpose, there has been limited exploration of their applicability to opinion texts. We address this gap by studying: (1) how humans judge the similarity of pairs of opinion-bearing sentences; and (2) the degree to which existing text similarity metrics, particularly embedding-based ones, correspond to human judgments. We crowdsourced annotations for opinion sentence pairs, and our main findings are: (1) annotators tend to agree on whether opinion sentences are similar or different; and (2) embedding-based metrics capture human judgments of “opinion similarity” but not of “opinion difference”. Based on our analysis, we identify areas where current metrics should be improved. We further propose learning a metric for opinion similarity by fine-tuning the Sentence-BERT sentence-embedding network on review text with weak supervision from review ratings. Experiments show that our learned metric outperforms existing text similarity metrics and, in particular, shows significantly higher correlations with human annotations for differing opinions.
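The abstract only states that Sentence-BERT is fine-tuned on review text with weak supervision from review ratings; the exact pairing scheme and loss are not given here. Below is a minimal, hypothetical sketch of that idea using the sentence-transformers library, where the pairing rule (same rating → similar, large rating gap → dissimilar) and the cosine-similarity regression loss are our assumptions for illustration, not the authors' confirmed setup.

```python
# Hypothetical sketch: fine-tune Sentence-BERT with weak supervision from review ratings.
# The pairing heuristic and loss below are assumptions, not the paper's exact method.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

# Toy review sentences with star ratings (placeholder data).
reviews = [
    ("The battery life is fantastic.", 5),
    ("Battery lasts all day, very impressed.", 5),
    ("The battery dies within an hour.", 1),
    ("Terrible battery, it drains overnight.", 1),
]

# Weak supervision: sentences from equally rated reviews are treated as similar (1.0),
# sentences from reviews with very different ratings as dissimilar (0.0).
examples = []
for i, (s1, r1) in enumerate(reviews):
    for s2, r2 in reviews[i + 1:]:
        if r1 == r2:
            examples.append(InputExample(texts=[s1, s2], label=1.0))
        elif abs(r1 - r2) >= 3:
            examples.append(InputExample(texts=[s1, s2], label=0.0))

# Base SBERT checkpoint; the specific model name is an assumption.
model = SentenceTransformer("bert-base-nli-mean-tokens")
loader = DataLoader(examples, shuffle=True, batch_size=16)
loss = losses.CosineSimilarityLoss(model)  # regress cosine similarity toward the weak labels

model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
```

After fine-tuning, opinion similarity between two sentences can be scored as the cosine similarity of their embeddings from `model.encode(...)`.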
Citation
Tay, W., Zhang, X., Wan, S., & Karimi, S. (2021). Measuring Similarity of Opinion-bearing Sentences. In 3rd Workshop on New Frontiers in Summarization, NewSum 2021 - Workshop Proceedings (pp. 74–84). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.newsum-1.9