The Unprejudiced Stemmer to Prevent Etymological Behavior of Stemmed Morphemes Of Social Media Corpora

  • Rao* A
  • et al.

Abstract

Sentiment analysis is an application of natural language processing that analyzes social media corpora to extract insights from them. Its results reflect real customer feedback, which helps organizations and companies make appropriate decisions about their products and business policies. Stemming, one of the phases of preprocessing social media corpora, plays a vital role in sentiment analysis. Most researchers today use strong stemmers to identify the stems of words in social media corpora. The most popular stemming algorithms, such as Lancaster and Porter, distort the meaning of words, and the resulting over-stemmed words mislead the sentiment classification process. To prevent over-stemming, a lighter stemming algorithm, the Unprejudiced stemmer, is proposed to preserve the meaning of stemmed words. The proposed Unprejudiced algorithm uses the lexical database and part-of-speech information of the Python Natural Language Toolkit (NLTK). Of the few available methods for evaluating stemming accuracy, this paper focuses on Paice's error rate relative to truncation (ERRT) to evaluate the Lancaster, Porter, and Unprejudiced stemming algorithms. Experiments were conducted on 25,758 source words, and the results were evaluated using the Paice stem evaluation method and the Sirsat method. The Paice ERRT values of 0.47209, 0.28703, and 0.15502 for Lancaster, Porter, and Unprejudiced, respectively, show that the Unprejudiced stemmer is more accurate than Lancaster and Porter. The Average Words Conflation Factor (AWCF) values from Sirsat's stem evaluation method, 10310.31, 14031.17, and 23349.87 for Lancaster, Porter, and Unprejudiced, respectively, likewise show that the Unprejudiced stemming algorithm is more accurate than the Lancaster and Porter stemming algorithms.
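
To make the notion of over-stemming concrete, the sketch below computes the two quantities that Paice's evaluation builds on: an under-stemming index (UI, the share of related word pairs a stemmer fails to merge) and an over-stemming index (OI, the share of unrelated word pairs it wrongly merges). It is only an illustration under stated assumptions. The word groups, the `truncate_stemmer` stand-in, and the function names are hypothetical and are not taken from the paper. The sketch does not compute ERRT itself, which additionally compares these indices against a truncation baseline.

```python
from itertools import combinations


def truncate_stemmer(word, length=4):
    """Hypothetical stand-in stemmer: keep only the first `length` letters."""
    return word[:length]


def paice_indices(groups, stem):
    """Return (UI, OI) for hand-grouped words under a stemming function.

    UI: share of same-group word pairs that the stemmer fails to merge.
    OI: share of different-group word pairs that the stemmer wrongly merges.
    """
    labelled = [(word, gid) for gid, words in enumerate(groups) for word in words]
    same_total = same_missed = diff_total = diff_merged = 0
    for (w1, g1), (w2, g2) in combinations(labelled, 2):
        merged = stem(w1) == stem(w2)
        if g1 == g2:
            same_total += 1
            same_missed += not merged   # related words left apart
        else:
            diff_total += 1
            diff_merged += merged       # unrelated words conflated
    ui = same_missed / same_total if same_total else 0.0
    oi = diff_merged / diff_total if diff_total else 0.0
    return ui, oi


# Illustrative sample only: each inner list is one group of related words.
groups = [
    ["universe", "universal"],
    ["university", "universities"],
    ["happy", "happiness", "happily"],
]

ui, oi = paice_indices(groups, truncate_stemmer)
print(f"UI={ui:.2f} OI={oi:.2f}")  # prints "UI=0.00 OI=0.25"
```

Here the aggressive four-letter truncation merges every related pair (UI is 0) but also conflates "universe" with "university", which is the kind of meaning-distorting over-stemming the abstract describes. If NLTK is installed, `PorterStemmer().stem` or `LancasterStemmer().stem` from `nltk.stem` could be passed as `stem` in place of the toy function.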

Cite

Rao, A. V. S. S. R., & P, Ranjana. (2019). The Unprejudiced Stemmer to Prevent Etymological Behavior of Stemmed Morphemes Of Social Media Corpora. International Journal of Innovative Technology and Exploring Engineering, 9(2), 3718–3724. https://doi.org/10.35940/ijitee.b6665.129219
