The aim of data mining is to extract meaningful information from large databases. Human error, high dimensionality, noisy data, and missing values can all degrade the performance of any process run over a dataset, so handling such data properly is important for improving performance. Many methods for handling missing data are available, and mean imputation is one of them. It is a preprocessing operation performed before any machine learning algorithm is applied. After mean imputation is applied to a dataset, a decision is made as to whether the imputed mean value is good or bad. The rpart decision tree algorithm is applied to a retailer dataset to handle a larger number of classes. The experimental results show no significant difference among variables. The results of various GLM models were compared and analyzed to provide better performance.
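As a minimal illustration of the mean imputation step described in the abstract, the sketch below replaces each missing entry in a numeric column with the mean of the observed entries. It is not the authors' implementation: the function name `impute_mean`, the toy `sales` column, and the choice to treat both `None` and NaN as missing are assumptions made for this example.

```python
import math
from statistics import mean


def impute_mean(column):
    """Return a copy of `column` with missing entries replaced by the observed mean.

    Assumption for this sketch: a value is "missing" if it is None or a float NaN.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    observed = [v for v in column if not is_missing(v)]
    if not observed:
        # With no observed values there is no mean to impute.
        raise ValueError("column has no observed values to average")

    fill = mean(observed)
    return [fill if is_missing(v) else v for v in column]


# Hypothetical toy column, not taken from the retailer dataset in the paper.
sales = [10.0, None, 14.0, float("nan"), 12.0]
imputed = impute_mean(sales)
print(imputed)
```

Mean imputation leaves the column mean unchanged but shrinks its variance, which is one reason the imputed values are checked before a classifier is trained on them.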
CITATION STYLE
Maheswari, K., Packia Amutha Priya, P., Ramkumar, S., & Arun, M. (2020). Missing Data Handling by Mean Imputation Method and Statistical Analysis of Classification Algorithm. In EAI/Springer Innovations in Communication and Computing (pp. 137–149). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-19562-5_14