Building large Arabic multi-domain resources for sentiment analysis

Abstract

While there has been recent progress in the area of Arabic Sentiment Analysis, most of the resources in this area are either of limited size, domain specific, or not publicly available. In this paper, we address this problem by generating large multi-domain datasets for Sentiment Analysis in Arabic. The datasets were scraped from different reviewing websites and consist of a total of 33K annotated reviews for movies, hotels, restaurants and products. Moreover, we build multi-domain lexicons from the generated datasets. Different experiments have been carried out to validate the usefulness of the datasets and the generated lexicons for the task of sentiment classification. From the experimental results, we highlight some useful insights addressing: the best performing classifiers and feature representation methods, the effect of introducing lexicon-based features, and factors affecting the accuracy of sentiment classification in general. All the datasets, experiment code and results have been made publicly available for scientific purposes.

Cite

ElSahar, H., & El-Beltagy, S. R. (2015). Building large Arabic multi-domain resources for sentiment analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9042, pp. 23–34). Springer Verlag. https://doi.org/10.1007/978-3-319-18117-2_2
