Will go at SemEval-2020 Task 9: An Accurate Approach for Sentiment Analysis on Hindi-English Tweets Based on Bert and pseudo Label Strategy

Abstract

Mixed languages are widely used on social media, especially in multilingual societies like India. Detecting the emotions expressed in such text is of great significance to the development of society and to political trends. In this paper, we propose an ensemble of a pseudo-label-based BERT model and a TF-IDF-based SGDClassifier model to identify the sentiment of Hindi-English (Hi-En) code-mixed data. The ensemble combines the rich semantic information from the BERT model with the word frequency information from the probabilistic n-gram model to predict the sentiment of a given code-mixed tweet. Our team achieved an average F1 score of 0.686 on the final leaderboard; our CodaLab username is will go.
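
The abstract names two ideas, pseudo-labelling and model ensembling, without giving their details. The following is a minimal sketch of how they might look. It is not the authors' implementation. The three-class label set, the 0.9 confidence threshold, the weighted averaging of class probabilities, and the function names `select_pseudo_labels` and `ensemble_predict` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumed label order; the abstract does not state the label set.
LABELS = ("negative", "neutral", "positive")


def select_pseudo_labels(
    texts: Sequence[str],
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # hypothetical cutoff, not from the paper
) -> List[Tuple[str, str]]:
    """Keep unlabeled texts whose top predicted probability reaches the threshold.

    The kept (text, label) pairs could then be added to the training set
    for another round of fine-tuning.
    """
    selected = []
    for text, p in zip(texts, probs):
        best = max(range(len(p)), key=lambda i: p[i])
        if p[best] >= threshold:
            selected.append((text, LABELS[best]))
    return selected


def ensemble_predict(
    bert_probs: Sequence[Sequence[float]],
    tfidf_probs: Sequence[Sequence[float]],
    bert_weight: float = 0.5,  # hypothetical mixing weight
) -> List[str]:
    """Average the two models' class probabilities and take the argmax."""
    predictions = []
    for pb, pt in zip(bert_probs, tfidf_probs):
        mixed = [bert_weight * b + (1.0 - bert_weight) * t for b, t in zip(pb, pt)]
        best = max(range(len(mixed)), key=lambda i: mixed[i])
        predictions.append(LABELS[best])
    return predictions
```

Here `bert_probs` and `tfidf_probs` stand for per-class probabilities produced by the two models for the same tweets. How the authors actually combine their models is not stated in the abstract.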

Cite

Bao, W., Chen, W., Bai, W., Zhuang, Y., Cheng, M., & Ma, X. (2020). Will go at SemEval-2020 Task 9: An Accurate Approach for Sentiment Analysis on Hindi-English Tweets Based on Bert and pseudo Label Strategy. In 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings (pp. 1348–1353). International Committee for Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.182
