Sentiment analysis can be performed with machine learning algorithms to automatically identify the sentiment expressed in online reviews of products or services. In many practical sentiment analysis scenarios, reviews must be classified on a scale of 1 to 5 stars, which is a multiclass problem. In the literature, we found that the best results for review classification come from solutions based on binary splits, which achieve accuracies above 90%. We therefore propose a model, based on the Nested Dichotomies algorithm, that performs multiclass classification through successive binary classification steps. To make this classifier more effective, we propose that the first split be defined by identifying the users’ recommendation threshold. We present a case study in which this classification model is applied to a set of subjective data extracted from TripAdvisor, discuss the process of determining the first split, and evaluate the accuracy of the proposed model.
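To make the idea of multiclass classification through successive binary splits concrete, the following is a minimal sketch and not the authors' implementation. It assumes each review is reduced to a single hypothetical polarity score, uses a toy nearest-centroid learner (`CentroidBinary`) at every node, and places the first split between 3 and 4 stars purely for illustration. The abstract does not state the threshold value, the base classifier, or how the later splits are chosen, so all of these are assumptions.

```python
from statistics import mean


class CentroidBinary:
    """Toy binary learner (assumption): nearest class centroid on one numeric feature."""

    def fit(self, xs, ys):
        # ys holds 0/1 labels; both labels must occur in the training data.
        self.c0 = mean(x for x, y in zip(xs, ys) if y == 0)
        self.c1 = mean(x for x, y in zip(xs, ys) if y == 1)
        return self

    def predict(self, x):
        return 1 if abs(x - self.c1) < abs(x - self.c0) else 0


class NestedDichotomy:
    """Binary tree over ordered classes; each internal node is one binary split."""

    def __init__(self, classes, first_split=None):
        self.classes = sorted(classes)
        # Highest class sent to the left branch at this node (root only).
        self.first_split = first_split

    def fit(self, xs, ys):
        if len(self.classes) == 1:
            return self  # leaf: nothing to learn
        cut = self.first_split
        if cut is None:
            # Assumption: deeper nodes split the remaining classes near the middle.
            cut = self.classes[(len(self.classes) - 1) // 2]
        # Binary problem at this node: is the rating above the cut?
        self.clf = CentroidBinary().fit(xs, [int(y > cut) for y in ys])
        self.children = []
        for group in ([c for c in self.classes if c <= cut],
                      [c for c in self.classes if c > cut]):
            pairs = [(x, y) for x, y in zip(xs, ys) if y in group]
            sub_xs = [x for x, _ in pairs]
            sub_ys = [y for _, y in pairs]
            self.children.append(NestedDichotomy(group).fit(sub_xs, sub_ys))
        return self

    def predict(self, x):
        if len(self.classes) == 1:
            return self.classes[0]
        # Follow the branch chosen by this node's binary classifier.
        return self.children[self.clf.predict(x)].predict(x)


# Hypothetical training data: one polarity score per review, with its star rating.
scores = [0.05, 0.10, 0.25, 0.30, 0.45, 0.50, 0.70, 0.75, 0.90, 0.95]
stars = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

# Illustrative first split: {1, 2, 3} versus {4, 5}.
model = NestedDichotomy(classes=[1, 2, 3, 4, 5], first_split=3).fit(scores, stars)
print([model.predict(s) for s in (0.08, 0.28, 0.48, 0.72, 0.93)])
```

In this sketch a 5-class decision takes at most three binary decisions, and the root split carries the "recommend or not" distinction, so an error at the root cannot be corrected further down the tree. That is one reason the choice of the first split matters.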
CITATION STYLE
Lunardi, A., Viterbo, J., Boscarioli, C., Bernardini, F., & Maciel, C. (2016). Domain-tailored multiclass classification of user reviews based on binary splits. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9742, pp. 298–309). Springer Verlag. https://doi.org/10.1007/978-3-319-39910-2_28