Abstract
Microblogs such as Twitter have made available a vast resource of User Generated Content (UGC) on which emotion analysis may be performed. Organizations increasingly value the opinions obtained from emotion analysis. These insights help drive decision-making and provide constructive input for engaging with customers and services. In a multiracial country such as Malaysia, tweets are commonly written in a mix of Malay, Malaysian slang and English. Such tweets increase the complexity of the emotion analysis task, especially given the serious lack of labeled data needed for supervised learning techniques. This paper explores the use of self-training, a semi-supervised technique that requires only a small initial labeled dataset, to conduct emotion analysis of Malaysian code-mixed Twitter data. The results are promising: the accuracy achieved is higher than that of the baseline models.
Cite
Tan, K. S. N., Lim, T. M., & Lim, Y. M. (2020). Emotion analysis using self-training on Malaysian code-mixed Twitter data. In Proceedings of the 13th IADIS International Conference ICT, Society and Human Beings 2020, ICT 2020 and Proceedings of the 6th IADIS International Conference Connected Smart Cities 2020, CSC 2020 and Proceedings of the 17th IADIS International Conference Web Based Communities and Social Media 2020, WBC 2020 - Part of the 14th Multi Conference on Computer Science and Information Systems, MCCSIS 2020 (pp. 181–188). IADIS. https://doi.org/10.33965/ict_csc_wbc_2020_202008l022