A comparative study on data cleaning approaches in sentiment analysis

Abstract

Sentiment analysis has become an important opinion mining technique and, in recent years, one of the most interesting fields in artificial intelligence. Pre-processing is considered a significant stage in sentiment analysis, but it has received little attention in the literature or in published models. Data collected from different sources may contain redundant and duplicate records, so the datasets need to undergo a detection process for any occurrence of redundancy. This paper reviews, analyzes, and compares different data cleaning algorithms, such as DySNI, PSNM, and brushing, for identifying redundancy in datasets. It also analyzes the effect of general data cleaning methods on accuracy when they are applied with different classifiers. The results reveal that the DySNI algorithm gives the highest accuracy, while the brushing algorithm (BAA-DD) reduces the dataset size to a greater extent. In addition, applying negation replacement and acronym expansion techniques helps to improve accuracy.
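To make the last two techniques concrete, the following is a minimal Python sketch of negation replacement and acronym expansion as they are commonly applied before sentiment classification. The abstract does not give the paper's actual rules or word lists, so the lookup tables `NEGATIONS` and `ACRONYMS` and the function names are hypothetical and serve only as an illustration.

```python
import re

# Hypothetical lookup tables for illustration only; the paper's own
# lists are not given in the abstract.
NEGATIONS = {
    "can't": "can not",
    "won't": "will not",
    "shan't": "shall not",
}
ACRONYMS = {
    "imo": "in my opinion",
    "gr8": "great",
    "btw": "by the way",
}

def replace_negations(text):
    """Lowercase the text and rewrite contracted negations as '<verb> not'."""
    text = text.lower()
    # Irregular contractions first, since the generic rule would mangle them.
    for short, full in NEGATIONS.items():
        text = text.replace(short, full)
    # Generic rule: "isn't" -> "is not", "didn't" -> "did not".
    return re.sub(r"(\w+)n't\b", r"\1 not", text)

def expand_acronyms(text):
    """Replace whitespace-separated tokens found in ACRONYMS with their expansion."""
    return " ".join(ACRONYMS.get(tok, tok) for tok in text.split())

def clean(text):
    """Apply negation replacement, then acronym expansion."""
    return expand_acronyms(replace_negations(text))

print(clean("IMO the movie wasn't gr8"))
```

The sketch handles only straight apostrophes and tokens without attached punctuation (for example, `gr8!` would not be expanded); a real pipeline would normally tokenize and normalize the text first.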

Cite

Mohamed Zakir, H., & Vinila Jinny, S. (2020). A comparative study on data cleaning approaches in sentiment analysis. In Lecture Notes in Electrical Engineering (Vol. 656, pp. 421–431). Springer. https://doi.org/10.1007/978-981-15-3992-3_35
