C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering

3Citations
Citations of this article
69Readers
Mendeley users who have this article in their library.

Abstract

In today's interconnected and multilingual world, code-mixing of languages on social media is a common occurrence. While many Natural Language Processing (NLP) tasks like sentiment analysis are mature and well designed for monolingual text, techniques to apply these tasks to code-mixed text still warrant exploration. This paper describes our feature engineering approach to sentiment analysis in code-mixed social media text for SemEval-2020 Task 9: SentiMix. We tackle this problem by leveraging a set of hand-engineered lexical, sentiment, and metadata features to design a classifier that can disambiguate between “positive”, “negative” and “neutral” sentiment. With this model we are able to obtain a weighted F1 score of 0.65 for the “Hinglish” task and 0.63 for the “Spanglish” tasks.

Cite

CITATION STYLE

APA

Advani, L., Lu, C., & Maharjan, S. (2020). C1 at SemEval-2020 Task 9: SentiMix: Sentiment Analysis for Code-Mixed Social Media Text using Feature Engineering. In 14th International Workshops on Semantic Evaluation, SemEval 2020 - co-located 28th International Conference on Computational Linguistics, COLING 2020, Proceedings (pp. 1227–1232). International Committee for Computational Linguistics. https://doi.org/10.18653/v1/2020.semeval-1.163

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free