Most previous Sentiment Analysis (SA) work has focused on English, with considerable success. In this work, we study SA in Arabic, a less-resourced language. SA in Arabic has been addressed in the literature, but earlier work targeted more formal, edited genres (e.g. newswire) and domains with longer text instances that carry more contextual information (e.g. reviews). Less work has focused on SA in Arabic for noisy, short-text genres such as micro-blogs. In addition, the time-changing nature of streaming data (e.g. the Twitter stream) has not been considered in previous work, as SA systems were mainly developed and evaluated on small test sets that are subsets of the original data set used for training. This work reports on a wide set of investigations into SA for Arabic tweets, systematically comparing two existing approaches that have been shown to be successful in English. Unlike previous work, we benchmark the trained models against an independent test set of more than 3.5k instances collected at different points in time, to account for topic shifts in the Twitter stream. Despite the challenging, noisy medium of Twitter and the mixed use of Dialectal and Standard forms of Arabic, we show that our SA systems attain performance scores on Arabic tweets comparable to those of state-of-the-art SA systems for English tweets.
CITATION STYLE
Refaee, E. (2017). Sentiment analysis for micro-blogging platforms in Arabic. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10283 LNCS, pp. 275–294). Springer Verlag. https://doi.org/10.1007/978-3-319-58562-8_22