Separation of Data Cleansing Concept from EDA

  • Purohit K

Abstract

Available datasets, whether structured, semi-structured, or unstructured, are used for various purposes. They are mostly used to solve a problem through techniques such as visualization, descriptive analysis, and algorithms. This data process involves many steps, two of which are exploratory data analysis (EDA) and data cleansing, both major operations in any data mining or machine learning study. After data is collected from various sources, data cleansing is performed to make the dataset more accurate, more useful, and less redundant. It deals with null values, duplicate values, multiple values, inconsistent values, inaccurate values, and similar problems in the dataset; left untreated, these fill the dataset with errors that affect analysis and decision making. Data cleansing guards against misleading outcomes such as inaccurate output, inaccurate machine learning models, and failure to uncover the hidden patterns in the dataset. The purpose of this paper is to study existing research on data cleansing and EDA and to state why the data cleansing process is not part of EDA.
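The distinction drawn in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the records, the field names (`id`, `city`, `age`), the assumed plausible age range, and the helper names `cleanse` and `explore` are all hypothetical. The cleansing step changes the data (dropping nulls, normalizing inconsistent text, rejecting implausible values, removing duplicates), while the exploratory step only summarizes it.

```python
from statistics import mean

# Hypothetical raw records; field names and values are invented for illustration.
RAW = [
    {"id": 1, "city": " Pune ", "age": 34},
    {"id": 2, "city": "pune", "age": None},   # null value
    {"id": 1, "city": " Pune ", "age": 34},   # duplicate record
    {"id": 3, "city": "MUMBAI", "age": 29},   # inconsistent casing
    {"id": 4, "city": "Mumbai", "age": -5},   # inaccurate value
]


def cleanse(records):
    """Return cleaned copies of the records; the input list is left unchanged."""
    seen = set()
    cleaned = []
    for rec in records:
        # Null values: drop any record with a missing field.
        if any(value is None for value in rec.values()):
            continue
        # Inconsistent values: normalize whitespace and casing of text fields.
        row = {k: v.strip().title() if isinstance(v, str) else v
               for k, v in rec.items()}
        # Inaccurate values: reject ages outside an assumed plausible range.
        if not 0 <= row["age"] <= 120:
            continue
        # Duplicate values: keep only the first occurrence of each record.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


def explore(records):
    """A tiny EDA step: summarize the data without modifying it."""
    rows_per_city = {}
    for rec in records:
        rows_per_city[rec["city"]] = rows_per_city.get(rec["city"], 0) + 1
    return {
        "rows": len(records),
        "mean_age": mean(rec["age"] for rec in records),
        "rows_per_city": rows_per_city,
    }


clean = cleanse(RAW)
summary = explore(clean)
```

Keeping the two functions separate mirrors the paper's thesis under these assumptions: `cleanse` returns a corrected dataset, whereas `explore` reads that dataset and reports on it without altering any record.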

Cite

Purohit, K. (2021). Separation of Data Cleansing Concept from EDA. International Journal of Data Science and Analysis, 7(3), 89. https://doi.org/10.11648/j.ijdsa.20210703.16
