Data-driven vs. dictionary-based word n-gram feature induction for sentiment analysis

Abstract

We address the question of which word n-gram feature induction approach yields the most accurate discriminative model for machine learning-based sentiment analysis within a specific domain: purely data-driven word n-gram feature induction, or word n-gram feature induction based on a domain-specific or domain-non-specific polarity dictionary. We evaluate both approaches in document-level polarity classification experiments in two languages, English and German, for four analogous domains each: user-written product reviews of books, DVDs, electronics and music. We conclude that while dictionary-based feature induction leads to large dimensionality reductions, purely data-driven feature induction yields more accurate discriminative models. © 2013 Springer-Verlag.
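
The difference between the two induction strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram orders, the dictionaries, or the classifier used, so the function names (`word_ngrams`, `induce_features`, `vectorize`), the unigram-plus-bigram setting, the toy documents, and the tiny polarity dictionary below are all assumptions made for illustration.

```python
from collections import Counter


def word_ngrams(tokens, n_max=2):
    """Return all word n-grams of order 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def induce_features(docs, n_max=2, dictionary=None):
    """Build a sorted feature vocabulary from tokenized documents.

    dictionary=None   -> data-driven: every observed n-gram becomes a feature.
    dictionary=<set>  -> dictionary-based: keep only n-grams listed in it.
    """
    vocab = set()
    for tokens in docs:
        for gram in word_ngrams(tokens, n_max):
            if dictionary is None or gram in dictionary:
                vocab.add(gram)
    return sorted(vocab)


def vectorize(tokens, vocab, n_max=2):
    """Map one document to n-gram counts over a fixed vocabulary."""
    counts = Counter(word_ngrams(tokens, n_max))
    return [counts[gram] for gram in vocab]


# Hypothetical toy data; a real polarity dictionary would be far larger.
docs = [
    "a really good book".split(),
    "not good at all".split(),
]
polarity_dictionary = {"good", "not good", "bad"}

data_driven = induce_features(docs)
dict_based = induce_features(docs, dictionary=polarity_dictionary)

print(len(data_driven), len(dict_based))
print(dict_based, vectorize(docs[1], dict_based))
```

On this toy input the dictionary-based vocabulary is much smaller than the data-driven one, which mirrors the dimensionality reduction the abstract mentions. Which representation gives a more accurate classifier is an empirical question that the sketch does not address.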

Cite

Remus, R., & Rill, S. (2013). Data-driven vs. dictionary-based word n-gram feature induction for sentiment analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8105 LNAI, pp. 176–183). https://doi.org/10.1007/978-3-642-40722-2_18
