Abstract
Over the past decade, extensive research on text classification has produced fast and accurate algorithms. Most of these algorithms rely on bag-of-words representations, which generate high-dimensional data. However, only a few supervised learning methods, such as SVMs, can handle high-dimensional data efficiently. To overcome this limitation, we propose using prior polarity words (PPW) to create a compact and representative feature set for financial news classification. With this approach, feature sets can be reduced from thousands of features to fewer than tens of features without compromising the accuracy of the text classifier. We measured the accuracy, precision, recall, F-measure, and ROC AUC of text classifiers using PPW. Classifiers using PPW achieved the best results when compared with a wide range of feature selection methods. With PPW, Support Vector Machines and Naive Bayes performed consistently better than with the full feature set. PPW also made Naive Bayes comparable to SVMs, as indicated by the improved scores on all measures tested.
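To illustrate how a prior polarity lexicon can replace a bag-of-words vocabulary with a handful of features, here is a minimal sketch. The abstract does not give the word lists or the exact feature definitions, so the `POSITIVE` and `NEGATIVE` sets, the `ppw_features` function, and the three features it returns are all assumptions made for illustration, not the authors' method.

```python
import re

# Hypothetical toy lexicon: the actual PPW lists used in the paper
# are not given in the abstract.
POSITIVE = {"gain", "profit", "growth", "surge", "upgrade"}
NEGATIVE = {"loss", "decline", "drop", "default", "downgrade"}


def ppw_features(text):
    """Map a document to three polarity features instead of one
    feature per vocabulary word (an assumed feature design)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)  # count of positive-polarity words
    neg = sum(t in NEGATIVE for t in tokens)  # count of negative-polarity words
    return [pos, neg, pos - neg]              # net polarity as a third feature


headline = "Shares surge after profit upgrade despite earlier loss"
features = ppw_features(headline)
```

A vector like this could then be passed to any standard classifier, such as Naive Bayes or an SVM, in place of the full bag-of-words vector.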
Cite
Campos, E., & Matsubara, E. (2014). An experimental evaluation of sentiment analysis on financial news using prior polarity words. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8864, 218–228. https://doi.org/10.1007/978-3-319-12027-0_18