Importance of Text Data Preprocessing & Implementation in RapidMiner

Vaishali Kalra; Rashmi Aggarwal

Conference Proceedings, Open Access

Proceedings of the First International Conference on Information Technology and Knowledge Management (2018), Vol. 14, pp. 71-75

DOI: 10.15439/2017km46

Abstract

Data preparation is an important phase before applying any machine learning algorithm. The same holds for text data: before a machine learning algorithm can be applied to text, the data must be prepared. Data preparation is done through preprocessing. Preprocessing of text means cleaning out noise, such as stop words, punctuation, and terms that carry little weight in the context of the text. In this paper, we describe in detail how to prepare data for machine learning algorithms using the RapidMiner tool. The preprocessing is followed by conversion of the bag of words into a term vector model, and we describe the various algorithms that can be applied in RapidMiner for data analysis and predictive modeling. We also discuss the challenges and applications of text mining in recent days.

Index Terms: RapidMiner, Preprocessing of text, TF-IDF, Term Vector Model.
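The pipeline named in the abstract (noise removal, bag of words, TF-IDF term vectors) can be illustrated with a minimal sketch. This is plain Python, not RapidMiner, and it does not reproduce the paper's process. The stop-word list, the function names `preprocess` and `tf_idf`, the sample documents, and the weighting variant (term frequency normalized by document length, multiplied by the natural log of N/df) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "on", "for"}


def preprocess(text):
    """Lowercase, replace punctuation with spaces, split, drop stop words."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def tf_idf(documents):
    """Return the sorted vocabulary and one TF-IDF vector per document.

    Assumed weighting: tf = count / document length, idf = ln(N / df).
    Other variants (smoothing, log-scaled tf) are equally common.
    """
    bags = [Counter(preprocess(doc)) for doc in documents]  # bag of words
    vocab = sorted(set().union(*bags))
    n_docs = len(bags)
    # Document frequency: number of documents containing each term.
    df = {term: sum(1 for bag in bags if term in bag) for term in vocab}
    vectors = []
    for bag in bags:
        total = sum(bag.values()) or 1  # avoid division by zero on empty docs
        vectors.append(
            [(bag[term] / total) * math.log(n_docs / df[term]) for term in vocab]
        )
    return vocab, vectors


# Hypothetical sample documents, used only to exercise the functions.
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log!",
]
vocab, vectors = tf_idf(docs)
```

In this sketch a term that occurs in every document, such as "sat", receives a weight of zero, which is the usual motivation for TF-IDF over raw counts: terms that do not discriminate between documents contribute nothing to the term vector.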

Citation (APA)

Kalra, V., & Aggarwal, R. (2018). Importance of Text Data Preprocessing & Implementation in RapidMiner. In Proceedings of the First International Conference on Information Technology and Knowledge Management (Vol. 14, pp. 71–75). PTI. https://doi.org/10.15439/2017km46
