Data Pre Processing for Machine Learning Models using Python Libraries

Namrata Pandey; Pawan Kumar Patnaik; Sargam Gupta

Journal Article

International Journal of Engineering and Advanced Technology (2020) 9(4) 1995-1999

DOI: 10.35940/ijeat.d9057.049420

Abstract

Data pre-processing is the process of transforming raw data into a usable dataset. It is one of the most important phases of building any machine learning model, because the quality and efficiency of the model depend directly on the dataset: a model built on data that still contains missing values will be less efficient and will behave inconsistently. This paper describes a methodology for pre-processing data in a sequence of seven steps using open-source Python libraries: pandas, a high-level data manipulation tool, and scikit-learn, a machine learning library that supports both supervised and unsupervised learning and provides tools for model fitting, data pre-processing, model selection, and many other utilities. The steps include importing datasets, handling missing values, and handling categorical values, among others. The resulting cleaned and transformed datasets can then be applied to any learning model to produce an efficient machine learning model.
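As a rough illustration of three of the steps the abstract names (importing a dataset, handling missing values, and handling categorical values), the following is a minimal sketch using pandas and scikit-learn. It is not the paper's own code: the toy data, the column names, the `preprocess` helper, and the choices of mean imputation and one-hot encoding are all assumptions made for the example.

```python
import io

import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for a CSV file on disk; not from the paper.
RAW_CSV = """country,age,salary
France,44,72000
Spain,27,48000
Germany,30,
Spain,,61000
"""


def preprocess(csv_text):
    """Load a CSV, impute missing numeric values, one-hot encode categories."""
    # Import the dataset (empty fields are read as NaN).
    df = pd.read_csv(io.StringIO(csv_text))

    numeric_cols = ["age", "salary"]   # assumed numeric columns
    categorical_cols = ["country"]     # assumed categorical column

    transformer = ColumnTransformer(
        transformers=[
            # Missing values: replace each NaN with the column mean (assumed strategy).
            ("num", SimpleImputer(strategy="mean"), numeric_cols),
            # Categorical values: one binary column per category (assumed encoding).
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
        ],
        sparse_threshold=0,  # always return a dense array
    )
    features = transformer.fit_transform(df)
    return df, np.asarray(features)


if __name__ == "__main__":
    frame, matrix = preprocess(RAW_CSV)
    print(frame.isna().sum())  # missing values per column before imputation
    print(matrix.shape)        # rows x (2 numeric + 3 one-hot columns)
```

Other imputation strategies (for example `strategy="median"` or `strategy="most_frequent"`) or other encodings may suit a given dataset better; the abstract does not specify which ones the authors use.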

Cite

CITATION STYLE

APA

Pandey, N., Patnaik, P. K., & Gupta, S. (2020). Data Pre Processing for Machine Learning Models using Python Libraries. International Journal of Engineering and Advanced Technology, 9(4), 1995–1999. https://doi.org/10.35940/ijeat.d9057.049420
