Evaluating the Accuracy and Efficiency of Sentiment Analysis Pipelines with UIMA

Abstract

Sentiment analysis methods co-ordinate text mining components, such as sentence splitters, tokenisers and classifiers, into pipelined applications that automatically analyse the emotions or sentiment expressed in textual content. However, the performance of a sentiment analysis pipeline is known to depend substantially on its constituent components. In this paper, we leverage the Unstructured Information Management Architecture (UIMA) to co-ordinate components into sentiment analysis pipelines. We then evaluate a wide range of component combinations to identify optimal settings. More specifically, we evaluate different pre-processing components (e.g. tokenisers and stemmers), feature weighting schemes (e.g. TF and TFIDF), feature types (e.g. bigrams, trigrams and bigrams+trigrams) and classification algorithms (e.g. Support Vector Machines, Random Forest and Naive Bayes) against six publicly available datasets. The results demonstrate that the optimal configurations are consistent across the six datasets, and that our UIMA-based pipeline performs robustly when compared to baseline methods.
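
To make the evaluated dimensions concrete, the following is a minimal Python sketch of n-gram feature extraction, TF and TFIDF weighting, and the enumeration of a configuration grid. It is not the authors' UIMA implementation: the function names, the `idf = log(N / df)` variant and the option labels in the grid are all assumptions chosen for illustration, and the classifiers are named but not trained.

```python
import math
from collections import Counter
from itertools import product


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) for a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(tokens, sizes):
    """Count n-grams of every requested size, e.g. sizes=(2, 3) for bigrams+trigrams."""
    counts = Counter()
    for n in sizes:
        counts.update(ngrams(tokens, n))
    return counts


def tf(counts):
    """Term frequency: raw count divided by the total feature count of the document."""
    total = sum(counts.values())
    return {f: c / total for f, c in counts.items()} if total else {}


def tfidf(doc_counts):
    """TF-IDF over a small corpus, assuming idf = log(N / df) (one common variant)."""
    n_docs = len(doc_counts)
    df = Counter()
    for counts in doc_counts:
        df.update(counts.keys())  # each document contributes at most 1 per feature
    return [
        {f: w * math.log(n_docs / df[f]) for f, w in tf(counts).items()}
        for counts in doc_counts
    ]


# Hypothetical configuration grid; the labels are illustrative only.
FEATURE_TYPES = {"bigrams": (2,), "trigrams": (3,), "bigrams+trigrams": (2, 3)}
WEIGHTINGS = ("TF", "TFIDF")
CLASSIFIERS = ("SVM", "RandomForest", "NaiveBayes")


def configurations():
    """Enumerate every (feature type, weighting, classifier) combination."""
    return list(product(FEATURE_TYPES, WEIGHTINGS, CLASSIFIERS))
```

Under these assumptions, a feature that occurs in every document receives a TFIDF weight of zero, whereas TF keeps it; this is the kind of difference between weighting schemes that a systematic comparison across configurations would expose.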

Cite

Altrabsheh, N., Kontonatsios, G., & Korkontzelos, Y. (2019). Evaluating the Accuracy and Efficiency of Sentiment Analysis Pipelines with UIMA. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11608 LNCS, pp. 286–294). Springer Verlag. https://doi.org/10.1007/978-3-030-23281-8_23
