SENTIX: A Sentiment-Aware Pre-Trained Model for Cross-Domain Sentiment Analysis

84 Citations · 130 Mendeley Readers

Abstract

Pre-trained language models have been widely applied to cross-domain NLP tasks such as sentiment analysis, achieving state-of-the-art performance. However, because users' emotional expressions vary across domains, fine-tuning a pre-trained model on the source domain tends to overfit, leading to inferior results on the target domain. In this paper, we pre-train a sentiment-aware language model (SENTIX) via domain-invariant sentiment knowledge from large-scale review datasets, and apply it to cross-domain sentiment analysis tasks without fine-tuning. We propose several pre-training tasks at both the token and sentence levels, based on existing lexicons and annotations such as emoticons, sentiment words, and ratings, requiring no additional human annotation. A series of experiments demonstrates the advantages of our model: we obtain new state-of-the-art results on all cross-domain sentiment analysis tasks, and SENTIX trained with only 1% of the samples (18 samples) outperforms BERT trained with 90% of the samples. Code is available at https://github.com/12190143/SentiX.
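The token-level pre-training idea described in the abstract (masking sentiment cues such as sentiment words and emoticons so the model must recover them from context) can be sketched roughly as follows. This is an illustrative simplification, not the authors' actual implementation; the lexicon entries, emoticon set, and function name are assumptions.

```python
# Illustrative sketch of sentiment-aware masking for pre-training.
# A small toy lexicon stands in for the sentiment lexicons the paper
# builds on; a real setup would use a full lexicon (e.g. thousands
# of entries) and subword tokenization.
SENTIMENT_LEXICON = {"great", "terrible", "love", "awful", "amazing"}
EMOTICONS = {":)", ":(", ":D"}

def sentiment_aware_mask(tokens, mask_token="[MASK]"):
    """Mask every sentiment word or emoticon in `tokens`.

    Returns the masked token sequence plus per-position labels:
    the original token where a cue was masked, None elsewhere
    (positions with None would be ignored in the MLM loss).
    """
    masked, labels = [], []
    for tok in tokens:
        if tok.lower() in SENTIMENT_LEXICON or tok in EMOTICONS:
            masked.append(mask_token)
            labels.append(tok)   # prediction target for the model
        else:
            masked.append(tok)
            labels.append(None)  # not scored
    return masked, labels

masked, labels = sentiment_aware_mask("I love this phone :)".split())
print(masked)  # ['I', '[MASK]', 'this', 'phone', '[MASK]']
```

Because the masked cues (unlike random tokens) carry domain-invariant sentiment signal, predicting them pushes the model toward sentiment representations that transfer across review domains, which is the intuition behind skipping target-domain fine-tuning.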

Citation (APA)

Zhou, J., Tian, J., Wang, R., Wu, Y., Xiao, W., & He, L. (2020). SENTIX: A Sentiment-Aware Pre-Trained Model for Cross-Domain Sentiment Analysis. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 568–579). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.49
