Missing Data Handling by Mean Imputation Method and Statistical Analysis of Classification Algorithm

Abstract

The goal of data mining is to extract meaningful information from large databases. Human error, high dimensionality, noisy data, and missing values can all degrade the performance of processing on a dataset, so handling such data properly is important for improving performance. Many methods for handling missing data are available, and mean imputation is one of them. It is a preprocessing operation performed before any machine learning algorithm is applied. After mean imputation is applied to a dataset, a decision is made as to whether the imputed mean value is good or bad. The rpart decision tree algorithm is applied to a retailer dataset to handle a larger number of classes. The experimental results show no significant difference among the variables. The results of several different GLM models were compared and analyzed to provide better performance.
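As a rough illustration of the mean imputation step described in the abstract, the following minimal Python sketch replaces each missing entry with the mean of the observed values in its column. It is not the authors' implementation (the abstract names the R package rpart for the classification step, not for imputation); the function name `mean_impute` and the toy data are hypothetical and chosen only for this example.

```python
import math
from statistics import mean


def mean_impute(rows):
    """Return a copy of rows where missing entries (None or NaN)
    are replaced by the mean of the observed values in that column."""

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not is_missing(row[j])]
        # A column with no observed values has no mean; leave it missing.
        col_means.append(mean(observed) if observed else None)

    return [
        [col_means[j] if is_missing(row[j]) else row[j] for j in range(n_cols)]
        for row in rows
    ]


# Hypothetical toy data: two numeric columns with one missing value each.
data = [
    [10.0, 1.0],
    [None, 3.0],
    [30.0, float("nan")],
]

imputed = mean_impute(data)
print(imputed)
```

Mean imputation leaves each column's mean unchanged but reduces its variance, which is one reason the quality of the imputed values should be checked before a classifier is trained on them.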

Citation (APA)

Maheswari, K., Packia Amutha Priya, P., Ramkumar, S., & Arun, M. (2020). Missing Data Handling by Mean Imputation Method and Statistical Analysis of Classification Algorithm. In EAI/Springer Innovations in Communication and Computing (pp. 137–149). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-19562-5_14
