Missing Value Imputation with MERCS: A Faster Alternative to MissForest

Elia Van Wolputte; Hendrik Blockeel

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2020) 12323 LNAI 502-516

DOI: 10.1007/978-3-030-61527-7_33

Abstract

Fundamentally, many problems in Machine Learning are understood as some form of function approximation: given a dataset, learn a function. However, this overlooks the ubiquitous problem of missing data. For example, if an unseen instance later arrives with missing input variables, we actually need a function that can predict its label from only the observed inputs. Strategies to deal with missing data come in three kinds: naive, probabilistic and iterative. The naive case replaces missing values with a fixed value (e.g. the mean), then uses the learned function as if nothing was ever missing. The probabilistic case has a generative model of the data and uses probabilistic inference to find the most likely value of the target, given values for any subset of the inputs. The iterative approach consists of a loop: according to some model, fill in all the missing values based on the given ones, retrain on the completed data and redo the predictions, until these converge. MissForest is a well-known realization of this idea using Random Forests. In this work, we establish the connection between MissForest and MERCS (a multi-directional generalization of Random Forests). We go on to show that under certain (realistic) conditions where the retraining step in MissForest becomes a bottleneck, MERCS (which is trained only once) offers at-par predictive performance at a fraction of the time cost.
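The naive and iterative strategies described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code and it is neither MissForest nor MERCS: a small k-nearest-neighbour averager stands in for the Random Forests, and all names (`mean_impute`, `knn_predict`, `iterative_impute`) as well as the toy data are hypothetical. Missing entries are assumed to be encoded as `None`, and every column is assumed to have at least one observed value.

```python
from math import dist


def mean_impute(rows):
    """Naive strategy: replace each missing entry (None) with its column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def knn_predict(train_x, train_y, query, k=2):
    """Stand-in model (NOT a Random Forest): mean target of the k nearest rows."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: dist(p[0], query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def iterative_impute(rows, max_iter=10, tol=1e-6):
    """Iterative strategy: fill in, refit on the completed data, repeat."""
    missing = [[v is None for v in r] for r in rows]
    filled = mean_impute(rows)  # initial guess for every missing entry
    n_cols = len(rows[0])
    for _ in range(max_iter):
        change = 0.0
        for j in range(n_cols):
            others = [c for c in range(n_cols) if c != j]
            # "Retrain": rebuild the model for column j from the rows where
            # column j was observed, using the current filled-in inputs.
            train = [i for i in range(len(rows)) if not missing[i][j]]
            train_x = [[filled[i][c] for c in others] for i in train]
            train_y = [filled[i][j] for i in train]
            for i in range(len(rows)):
                if missing[i][j]:
                    query = [filled[i][c] for c in others]
                    new_value = knn_predict(train_x, train_y, query)
                    change += abs(new_value - filled[i][j])
                    filled[i][j] = new_value
        if change < tol:  # imputed values stopped moving: converged
            break
    return filled


# Toy data: the second column is twice the first; one entry is missing.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, None], [4.0, 8.0]]
print(mean_impute(data)[2][1])       # column mean of the observed values
print(iterative_impute(data)[2][1])  # 6.0 for this toy data
```

The model for every column is rebuilt on each pass of the outer loop. That repeated refitting is the retraining cost the abstract identifies as a potential bottleneck in MissForest, and which a model trained only once avoids.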

Cite

Van Wolputte, E., & Blockeel, H. (2020). Missing Value Imputation with MERCS: A Faster Alternative to MissForest. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12323 LNAI, pp. 502–516). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61527-7_33
