Investigating redundancy in emoji use: Study on a twitter based corpus

Giulia Donato; Patrizia Paggio

Conference Proceedings, Open Access

EMNLP 2017 - 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, WASSA 2017 - Proceedings of the Workshop (2017) 118-126

DOI: 10.18653/v1/w17-5216

29 Citations

104 Readers

Abstract

In this paper we present an annotated corpus created to analyze the informative behaviour of emoji, an issue of importance for sentiment analysis and natural language processing. The corpus, which consists of 2475 tweets all containing at least one emoji, has been annotated using three possible classes: Redundant, Non Redundant, and Non Redundant + POS. We explain how the corpus was collected and describe the annotation procedure and the interface developed for the task. We provide an analysis of the corpus that also considers possible predictive features, discuss the problematic aspects of the annotation, and suggest future improvements.

Citation (APA)

Donato, G., & Paggio, P. (2017). Investigating redundancy in emoji use: Study on a twitter based corpus. In EMNLP 2017 - 8th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis, WASSA 2017 - Proceedings of the Workshop (pp. 118–126). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-5216
