Strategy to managing mixed datasets with missing items

2Citations
Citations of this article
11Readers
Mendeley users who have this article in their library.
Get full text

Abstract

The paper refers to the problem of decision making and choosing appropriate ways for decreasing the level of input information uncertainty related to absence or unavailability some values of mixed data sets. Approaches to addressing missing data and evaluating their performance are discussed. The generalized strategy to managing data with missing values is proposed. The study based on real pregnancy-related records of 186 patients from 12 to 42 weeks of gestation. Three missing data techniques: complete ignoring, case deletion, and random forest (RF) missing data imputation were applied to the medical data of various types, under a missing completely at random assumption for solving classification task and softening the negative impact of input information uncertainty. The efficiency of approaches to deal with missingness was evaluated. Results demonstrated that case deletion and ignoring missing values were the less suitable to handle mixed types of missing data and suggested RF imputation as a useful approach for imputing complex pregnancyrelated data sets with missing data.

Cite

CITATION STYLE

APA

Skarga-Bandurova, I., Biloborodova, T., & Dyachenko, Y. (2018). Strategy to managing mixed datasets with missing items. In Communications in Computer and Information Science (Vol. 854, pp. 608–620). Springer Verlag. https://doi.org/10.1007/978-3-319-91476-3_50

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free