Analysis and visualization of missing value patterns

Bas van Stein; Wojtek Kowalczyk; Thomas Bäck

Conference Proceedings

Communications in Computer and Information Science (2016) 611 187-198

DOI: 10.1007/978-3-319-40581-0_16

Abstract

Missing values in datasets are a relevant and often overlooked problem in many fields. Most algorithms cannot handle missing values when training a predictive model or analyzing a dataset. For this reason, records with missing values are either rejected or repaired. However, both repairing and rejecting affect the dataset and the final results, creating bias and uncertainty. Therefore, knowledge about the nature of missing values and the mechanisms behind them is of vital importance. To gain more in-depth insight into the underlying structures and patterns of missing values, the concept of Monotone Mixture Patterns is introduced and used to analyze the patterns of missing values in datasets. Several visualization methods are proposed to present the “patterns of missingness” in an informative way. Finally, an algorithm to generate missing values in datasets is provided to form the basis of a benchmarking tool. This algorithm can generate a large variety of missing value patterns for testing and comparing different algorithms that handle missing values.
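
The abstract does not define Monotone Mixture Patterns, so the following is only a minimal sketch of the classical notion of a monotone missingness pattern, not the authors' method. It tabulates the distinct "patterns of missingness" in a small table and checks whether the columns can be ordered so that, within every row, all values after the first missing one are also missing. The function names, the use of None as the missing marker, and the sample rows are assumptions made for illustration.

```python
from collections import Counter


def missingness_patterns(rows):
    """Count distinct missingness patterns; True marks a missing (None) cell."""
    return Counter(tuple(value is None for value in row) for row in rows)


def is_monotone(patterns):
    """Check the classical monotone property: the columns can be ordered so
    that once a value is missing in a row, all later columns are missing too."""
    patterns = list(patterns)
    if not patterns:
        return True
    n_cols = len(patterns[0])
    # Order columns from least to most often missing across the patterns.
    # If any valid ordering exists, this one is valid as well.
    order = sorted(range(n_cols), key=lambda j: sum(p[j] for p in patterns))
    for p in patterns:
        reordered = [p[j] for j in order]
        # An observed cell must never follow a missing cell.
        if any(a and not b for a, b in zip(reordered, reordered[1:])):
            return False
    return True


# Illustrative data (hypothetical): None marks a missing value.
monotone_rows = [
    (1.0, 2.0, 3.0),
    (1.0, 2.0, None),
    (1.0, None, None),
    (4.0, 5.0, None),
]
mixed_rows = [
    (1.0, None, 3.0),
    (None, 2.0, 3.0),
]

monotone_counts = missingness_patterns(monotone_rows)
mixed_counts = missingness_patterns(mixed_rows)
print(monotone_counts)
print(is_monotone(monotone_counts), is_monotone(mixed_counts))
```

A dataset that fails this check may still decompose into several groups of records that are each monotone on their own, which is one plausible reading of a "mixture" of patterns; the paper itself should be consulted for the actual definition.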

Cite

van Stein, B., Kowalczyk, W., & Bäck, T. (2016). Analysis and visualization of missing value patterns. In Communications in Computer and Information Science (Vol. 611, pp. 187–198). Springer Verlag. https://doi.org/10.1007/978-3-319-40581-0_16
