Data-driven vs. dictionary-based word n-gram feature induction for sentiment analysis

Robert Remus; Sven Rill

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 8105 LNAI 176-183

DOI: 10.1007/978-3-642-40722-2_18

Abstract

We address the question of which word n-gram feature induction approach yields the most accurate discriminative model for machine learning-based sentiment analysis within a specific domain: purely data-driven word n-gram feature induction, or word n-gram feature induction based on a domain-specific or domain-non-specific polarity dictionary. We evaluate both approaches in document-level polarity classification experiments in 2 languages, English and German, for 4 analogous domains each: user-written product reviews of books, DVDs, electronics and music. We conclude that while dictionary-based feature induction leads to large dimensionality reductions, purely data-driven feature induction yields more accurate discriminative models. © 2013 Springer-Verlag.
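
The contrast drawn in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "data-driven" means keeping every word n-gram observed in the training documents, and that "dictionary-based" means keeping only those observed n-grams that are also entries in a polarity dictionary. The function names, the whitespace tokenizer, the toy documents and the toy dictionary are all hypothetical.

```python
from collections import Counter


def word_ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def data_driven_features(documents, n_max=2):
    """Assumed data-driven induction: every n-gram seen in the training data."""
    features = set()
    for doc in documents:
        # Naive lowercase + whitespace tokenization, for illustration only.
        features.update(word_ngrams(doc.lower().split(), n_max))
    return features


def dictionary_based_features(documents, polarity_dictionary, n_max=2):
    """Assumed dictionary-based induction: observed n-grams that are dictionary entries."""
    return data_driven_features(documents, n_max) & set(polarity_dictionary)


def vectorize(document, features, n_max=2):
    """Count the n-grams of one document, restricted to the induced feature set."""
    counts = Counter(word_ngrams(document.lower().split(), n_max))
    return {f: c for f, c in counts.items() if f in features}


# Hypothetical toy data, not taken from the paper.
train_docs = ["a really great book", "not good and quite boring"]
toy_dictionary = {"great", "boring", "not good", "awful"}

full = data_driven_features(train_docs)
reduced = dictionary_based_features(train_docs, toy_dictionary)
print(len(full), len(reduced))
```

Under these assumptions the dictionary-based feature set is always a subset of the data-driven one, which is where the dimensionality reduction mentioned in the abstract would come from. Either feature set can then be passed to `vectorize` to build the input for a discriminative classifier.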

Citation (APA)

Remus, R., & Rill, S. (2013). Data-driven vs. dictionary-based word n-gram feature induction for sentiment analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8105 LNAI, pp. 176–183). https://doi.org/10.1007/978-3-642-40722-2_18
