Previous research on sentiment analysis mainly focuses on binary or ternary sentiment classification of monolingual texts. However, on today's social media such as micro-blogs, emotions are often expressed in bilingual or multilingual text, known as code-switching text, and people's emotions are complex, including happiness, sadness, anger, fear, surprise, etc. Different emotions may co-occur, and the proportion of each emotion in code-switching text is often unbalanced. Inspired by the recently proposed BERT model, in this paper we investigate how to fine-tune BERT for multi-label sentiment analysis of code-switching text. Our investigation covers the selection of pre-trained models and the fine-tuning methods of BERT for this task. To deal with the unbalanced distribution of emotions, we propose a method based on data augmentation, undersampling, and ensemble learning to obtain balanced samples and train different multi-label BERT classifiers. Our model combines the predictions of the individual classifiers to produce the final outputs. Experiments on the dataset of NLPCC 2018 shared task 1 show the effectiveness of our model for unbalanced code-switching text; its F1-score is higher than that of many previous models.
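The abstract outlines a pipeline of multi-label BERT classifiers trained on balanced subsamples whose predictions are combined. The sketch below illustrates that general idea, assuming a Hugging Face Transformers setup; the model name, emotion label set, decision threshold, and averaging scheme are illustrative assumptions, not the authors' released implementation.

```python
# A minimal sketch (not the authors' code): fine-tune several multi-label BERT
# classifiers, each on its own balanced (augmented/undersampled) split, then
# average their sigmoid probabilities and threshold per emotion.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Assumed emotion label set for illustration.
EMOTIONS = ["happiness", "sadness", "anger", "fear", "surprise"]

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")

def build_classifier():
    # One multi-label head: sigmoid outputs with BCE loss via problem_type.
    return BertForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased",
        num_labels=len(EMOTIONS),
        problem_type="multi_label_classification",
    )

@torch.no_grad()
def ensemble_predict(models, text, threshold=0.5):
    """Average sigmoid probabilities across the ensemble, then threshold."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    probs = torch.stack(
        [torch.sigmoid(m(**inputs).logits.squeeze(0)) for m in models]
    ).mean(dim=0)
    return [emo for emo, p in zip(EMOTIONS, probs) if p >= threshold]

# Usage (after fine-tuning each classifier on its own balanced split):
# models = [build_classifier() for _ in range(3)]
# print(ensemble_predict(models, "今天好开心 so happy!"))
```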
Citation:
Tang, T., Tang, X., & Yuan, T. (2020). Fine-Tuning BERT for Multi-Label Sentiment Analysis in Unbalanced Code-Switching Text. IEEE Access, 8, 193248–193256. https://doi.org/10.1109/ACCESS.2020.3030468