Abstract: Data preparation is an important phase before applying any machine learning algorithm, and text data is no exception. Text is prepared through preprocessing, which means cleaning out noise such as stop words, punctuation, and terms that carry little weight in the context of the text. In this paper, we describe in detail how to prepare data for machine learning algorithms using the RapidMiner tool. We then convert the bag of words into a term vector model and describe the various algorithms that can be applied in RapidMiner for data analysis and predictive modeling. We also discuss the challenges and recent applications of text mining.
Index Terms: RapidMiner, Preprocessing of text, TF-IDF, Term Vector Model.
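The preprocessing and weighting steps named in the abstract (stop-word and punctuation removal, followed by a TF-IDF term vector model) can be illustrated with a minimal Python sketch. This is not the paper's RapidMiner workflow: the stop-word list, the function names `preprocess` and `tf_idf`, the sample documents, and the specific weighting (term count divided by document length, times log(N / document frequency)) are all assumptions chosen for illustration.

```python
import math
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "on"}


def preprocess(text):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = log(N / df), one common
    textbook variant; other tools may use different weighting schemes.
    """
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents, used only to exercise the functions.
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log!",
]
vectors = tf_idf(docs)
```

In this sketch, a term that occurs in every document (here "sat") receives a weight of zero, while terms unique to one document receive a positive weight, which is the behavior that makes TF-IDF useful for separating distinctive terms from common ones.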
CITATION STYLE
Kalra, V., & Aggarwal, R. (2018). Importance of Text Data Preprocessing & Implementation in RapidMiner. In Proceedings of the First International Conference on Information Technology and Knowledge Management (Vol. 14, pp. 71–75). PTI. https://doi.org/10.15439/2017km46