Mojisem: Varying linguistic purposes of emoji in (Twitter) context

Noa Na'aman; Hannah Provenza; Orion Montoya

Conference Proceedings, Open Access

ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Student Research Workshop (2017) 136-141

DOI: 10.18653/v1/P17-3022

Abstract

Early research into emoji in textual communication has focused largely on high-frequency usages and ambiguity of interpretations. Investigation of a wide range of emoji usage shows these glyphs serving at least two very different purposes: as content and function words, or as multimodal affective markers. Identifying where an emoji is replacing textual content allows NLP tools the possibility of parsing them as any other word or phrase. Recognizing the import of non-content emoji can be a significant part of understanding a message as well. We report on an annotation task on English Twitter data with the goal of classifying emoji uses by these categories, and on the effectiveness of a classifier trained on these annotations. We find that it is possible to train a classifier to tell the difference between those emoji used as linguistic content words and those used as paralinguistic or affective multimodal markers even with a small amount of training data, but that accurate sub-classification of these multimodal emoji into specific classes like attitude, topic, or gesture will require more data and more feature engineering.

Citation (APA)

Na’aman, N., Provenza, H., & Montoya, O. (2017). Mojisem: Varying linguistic purposes of emoji in (Twitter) context. In ACL 2017 - 55th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Student Research Workshop (pp. 136–141). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/P17-3022
