Importance of Text Data Preprocessing & Implementation in RapidMiner

  • Kalra V
  • Aggarwal R

Abstract

Data preparation is an important phase before applying any machine learning algorithm, and text data is no exception: it must be prepared before a machine learning algorithm can be applied to it. This preparation is done through preprocessing. Preprocessing text means cleaning out noise such as stop words, punctuation, and terms that carry little weight in the context of the text. In this paper, we describe in detail how to prepare data for machine learning algorithms using the RapidMiner tool. We then describe the conversion of the bag of words into a term vector model and the various algorithms that can be applied in RapidMiner for data analysis and predictive modeling. We also discuss the challenges and recent applications of text mining.

Index Terms: RapidMiner, Preprocessing of text, TF-IDF, Term Vector Model.
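
The preprocessing and term-vector steps named in the abstract can be illustrated outside RapidMiner with a minimal Python sketch. This is not the authors' RapidMiner process: the stop-word list, the function names `preprocess` and `tf_idf`, the sample documents, and the TF-IDF variant (term frequency times log(N / document frequency)) are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "and", "to"}


def preprocess(text):
    """Lowercase, drop punctuation, tokenize, and remove stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    token_lists = [preprocess(d) for d in documents]
    n_docs = len(token_lists)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents.
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log!",
]
vectors = tf_idf(docs)
```

Under this variant, a term that occurs in every document (here "sat") gets weight zero, while terms unique to one document keep a positive weight, which is the sense in which terms that carry little weight are discounted.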

Cite

Kalra, V., & Aggarwal, R. (2018). Importance of Text Data Preprocessing & Implementation in RapidMiner. In Proceedings of the First International Conference on Information Technology and Knowledge Management (Vol. 14, pp. 71–75). PTI. https://doi.org/10.15439/2017km46
