Sentiment analysis of customer reviews: Balanced versus unbalanced datasets

Nicola Burns; Yaxin Bi; Hui Wang; Terry Anderson

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011), vol. 6881 LNAI (Part 1), pp. 161–170

DOI: 10.1007/978-3-642-23851-2_17

20 Citations

32 Readers

Abstract

A growing number of people buy products online and share their opinions of those products in online reviews. Sentiment analysis can extract valuable information from these reviews, and the results can benefit both consumers and manufacturers. This paper presents a study comparing two well-known machine learning algorithms, namely the dynamic language model and the naïve Bayes classifier. Experiments were carried out to determine how consistent the results are across datasets of different sizes, and to measure the effect of using a balanced or an unbalanced dataset. The experimental results indicate that both algorithms achieve better results on a realistic unbalanced dataset than on the balanced datasets commonly used in research. © 2011 Springer-Verlag.
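The abstract does not describe the datasets, the features, or how the balanced sets were built, so the following is only a minimal sketch of the kind of comparison it describes, not the authors' method. It assumes a whitespace tokenizer, a multinomial naïve Bayes classifier with add-one smoothing, and balancing by random undersampling of the majority class. All names (`tokenize`, `balance_by_undersampling`, `NaiveBayes`, `toy_reviews`) and the toy reviews are hypothetical and exist only for illustration. The dynamic language model is not sketched here.

```python
import math
import random
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def balance_by_undersampling(reviews, seed=0):
    """Return a class-balanced subset by randomly dropping majority-class items.

    Undersampling is an assumption; the abstract does not say how the
    balanced datasets were constructed.
    """
    by_label = defaultdict(list)
    for text, label in reviews:
        by_label[label].append((text, label))
    smallest = min(len(items) for items in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, smallest))
    return balanced


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, reviews):
        # Document counts per class give the class priors.
        self.label_counts = Counter(label for _, label in reviews)
        # Per-class word frequencies give the likelihoods.
        self.word_counts = defaultdict(Counter)
        for text, label in reviews:
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: four positive reviews, two negative (unbalanced).
toy_reviews = [
    ("great phone love it", "pos"),
    ("excellent battery great screen", "pos"),
    ("love the camera", "pos"),
    ("works great excellent value", "pos"),
    ("terrible battery broke quickly", "neg"),
    ("awful screen terrible support", "neg"),
]

balanced = balance_by_undersampling(toy_reviews)
model_unbalanced = NaiveBayes().fit(toy_reviews)
model_balanced = NaiveBayes().fit(balanced)
```

In a real comparison, both models would be scored on the same held-out reviews. The toy data above is far too small to say anything about which setting performs better.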

Citation (APA)

Burns, N., Bi, Y., Wang, H., & Anderson, T. (2011). Sentiment analysis of customer reviews: Balanced versus unbalanced datasets. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6881 LNAI, pp. 161–170). https://doi.org/10.1007/978-3-642-23851-2_17
