An effective imputation scheme for handling missing values in the heterogeneous dataset

Abstract

High data quality is a persistent concern for machine learning applications, including clinical decision support systems, weather forecasting, and traffic prediction. Comparatively little work has been devoted to exploiting missing values for effective imputation and better prediction. This paper introduces a unique approach to predicting and imputing missing fields in multivariate datasets that mix numerical, categorical, and unstructured data. The proposed imputation method is a multi-model scheme that combines natural language processing (NLP) encoders, machine learning-driven feature extractors, and a sequential regression imputation technique to predict missing values. The proposed system is robust and scalable without requiring extensive engineering. The model is validated on the benchmark clinical heart disease dataset obtained from UCI. The results show that the proposed method achieves better imputation accuracy and requires significantly less time than other missing data imputation methods.
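To make the sequential regression step named in the abstract concrete, here is a minimal sketch of the general technique. It is not the authors' implementation. It assumes every field has already been encoded as a number (the NLP encoders and feature extractors are not modelled) and that each column has at least one observed value. The function names, the ridge term, and the number of rounds are illustrative choices, not details from the paper.

```python
from statistics import mean


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(xs, ys, ridge=1e-6):
    """Least-squares weights (intercept first); a small ridge term keeps the system solvable."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)


def predictors(row, target):
    """All values of a row except the target column."""
    return [v for c, v in enumerate(row) if c != target]


def sequential_regression_impute(table, n_rounds=5):
    """Return a completed copy of `table` (rows of numbers, None marks a missing value)."""
    n_cols = len(table[0])
    missing = [[v is None for v in row] for row in table]
    # Start from column means of the observed values.
    col_means = [mean(row[j] for row in table if row[j] is not None)
                 for j in range(n_cols)]
    filled = [[col_means[j] if v is None else float(v) for j, v in enumerate(row)]
              for row in table]
    for _ in range(n_rounds):
        for j in range(n_cols):
            if not any(flags[j] for flags in missing):
                continue  # nothing to impute in this column
            # Fit column j on the other columns, using rows where j was observed.
            train = [(predictors(r, j), r[j])
                     for r, flags in zip(filled, missing) if not flags[j]]
            w = fit_ridge([x for x, _ in train], [y for _, y in train])
            # Overwrite only the originally missing cells with predictions.
            for r, flags in zip(filled, missing):
                if flags[j]:
                    x = predictors(r, j)
                    r[j] = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return filled
```

Each round regresses one incomplete column on all the others and refreshes its imputed cells, so later columns benefit from earlier estimates. In a heterogeneous setting, categorical and text fields would first need a numeric encoding, and categorical targets would typically call for a classifier rather than a linear regression.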

Cite

Venkatesh, S., Kumar, M. V. V., & Virupakshappa, A. D. (2023). An effective imputation scheme for handling missing values in the heterogeneous dataset. Indonesian Journal of Electrical Engineering and Computer Science, 32(1), 423–431. https://doi.org/10.11591/ijeecs.v32.i1.pp423-431
