Imputation for categorical attributes with probabilistic reasoning

Lian Jin; Hongzhi Wang; Hong Gao

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 7923 LNCS 87-98

DOI: 10.1007/978-3-642-38562-9_9

Abstract

Since incompleteness affects data usage, missing values in a database should be estimated to make data mining and analysis more accurate. Besides ignoring missing values or setting them to defaults, many imputation methods have been proposed, but all of them have limitations. This paper proposes a probabilistic method to estimate missing values. We construct a Bayesian network in a novel way to identify the dependencies in a dataset, then use the Bayesian reasoning process to find the most probable substitution for each missing value. The benefits of this method are that (1) irrelevant attributes can be ignored during estimation; (2) the network is built with no target attribute, which means all attributes are handled in one model; and (3) probability information can be obtained to measure the accuracy of the imputation. Experimental results show that our construction algorithm is effective and that the quality of the filled values exceeds that of the mode imputation and kNN methods. We also verify the effectiveness of the probabilities given by our method experimentally. © 2013 Springer-Verlag Berlin Heidelberg.
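The general idea of choosing the most probable substitution and reporting its probability can be illustrated with a minimal Python sketch. This is not the authors' network construction algorithm: it assumes the dependency structure is already known (a single hypothetical parent attribute, `weather`, for the attribute `umbrella`) and estimates the conditional distribution by counting complete rows. The toy data, attribute names, and the helpers `fit_cpt` and `impute` are all invented for illustration.

```python
from collections import Counter, defaultdict

def fit_cpt(rows, target, parents):
    """Count target values per parent configuration, using complete rows only."""
    counts = defaultdict(Counter)
    for row in rows:
        if row[target] is None or any(row[p] is None for p in parents):
            continue  # skip rows with a missing target or parent value
        key = tuple(row[p] for p in parents)
        counts[key][row[target]] += 1
    return counts

def impute(row, target, parents, cpt):
    """Return (most probable value, estimated probability) for a missing target."""
    key = tuple(row[p] for p in parents)
    dist = cpt.get(key)
    if not dist:
        return None, 0.0  # parent configuration never observed
    value, n = dist.most_common(1)[0]
    return value, n / sum(dist.values())

# Hypothetical toy dataset; None marks a missing value.
rows = [
    {"weather": "rain", "season": "winter", "umbrella": "yes"},
    {"weather": "rain", "season": "summer", "umbrella": "yes"},
    {"weather": "rain", "season": "winter", "umbrella": "yes"},
    {"weather": "rain", "season": "summer", "umbrella": "no"},
    {"weather": "sun", "season": "summer", "umbrella": "no"},
    {"weather": "sun", "season": "winter", "umbrella": "no"},
    {"weather": "sun", "season": "summer", "umbrella": "no"},
    {"weather": "rain", "season": "winter", "umbrella": None},
]

# Assumed structure: umbrella depends on weather only; season is ignored.
cpt = fit_cpt(rows, "umbrella", ["weather"])
value, prob = impute(rows[-1], "umbrella", ["weather"], cpt)
print(value, prob)
```

In this toy data, mode imputation would fill the gap with "no" (the globally most frequent value), whereas conditioning on the assumed parent selects "yes" and also yields a probability that can serve as a rough confidence measure for the filled value.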

Citation (APA)

Jin, L., Wang, H., & Gao, H. (2013). Imputation for categorical attributes with probabilistic reasoning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7923 LNCS, pp. 87–98). Springer Verlag. https://doi.org/10.1007/978-3-642-38562-9_9
