The goal of this paper is to compare existing sentiment analysis models, namely Doc2Vec and the Recursive Neural Tensor Network (RNTN), when applied to a corpus with skewed classes. Such a setting is not uncommon, but the literature lacks results on it. We used two techniques to balance the classes: under-sampling and over-sampling the target corpora. Doc2Vec achieved the best overall result on the skewed classes, but performed poorly on small and sampled configurations. RNTN achieved the best result on the over-sampled corpus. The Naive Bayes baseline was not surpassed on the under-sampled corpus with Pos/Neg classes, which was the smallest corpus configuration.
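The two balancing techniques named in the abstract can be illustrated with a minimal sketch. It assumes plain random sampling, which the abstract does not specify: under-sampling cuts every class down to the size of the smallest one, and over-sampling duplicates examples until every class matches the largest one. The function names and the toy corpus are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def group_by_label(samples):
    """Group (text, label) pairs by their label."""
    groups = defaultdict(list)
    for text, label in samples:
        groups[label].append((text, label))
    return groups


def under_sample(samples, seed=0):
    """Randomly drop examples so every class has the minority-class size."""
    rng = random.Random(seed)
    groups = group_by_label(samples)
    target = min(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(rng.sample(group, target))
    return balanced


def over_sample(samples, seed=0):
    """Randomly duplicate examples so every class has the majority-class size."""
    rng = random.Random(seed)
    groups = group_by_label(samples)
    target = max(len(group) for group in groups.values())
    balanced = []
    for group in groups.values():
        balanced.extend(group)
        # Draw the missing examples with replacement from the same class.
        balanced.extend(rng.choices(group, k=target - len(group)))
    return balanced


# Hypothetical toy corpus: 6 positive and 2 negative reviews.
corpus = [(f"pos review {i}", "pos") for i in range(6)]
corpus += [(f"neg review {i}", "neg") for i in range(2)]

under = under_sample(corpus)
over = over_sample(corpus)
print(Counter(label for _, label in under))
print(Counter(label for _, label in over))
```

On this toy corpus, under-sampling leaves four examples in total while over-sampling yields twelve. That size difference is one reason an under-sampled configuration can be the smallest one a model is trained on.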
Brum, H., Araujo, F., & Kepler, F. (2016). Sentiment analysis for Brazilian Portuguese over a skewed class corpora. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9727, pp. 134–138). Springer Verlag. https://doi.org/10.1007/978-3-319-41552-9_14