A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery

Abstract

Testing for Conditional Independence (CI) is a fundamental task in causal discovery but is particularly challenging for mixed discrete-continuous data. In this context, inadequate assumptions or the discretization of continuous variables reduce the CI test's statistical power, which leads to incorrectly learned causal structures. In this work, we present a non-parametric CI test leveraging k-nearest-neighbor (kNN) methods, which are adaptive to mixed discrete-continuous data. In particular, a kNN-based conditional mutual information estimator serves as the test statistic, and the p-value is calculated using a kNN-based local permutation scheme. We prove the CI test's statistical validity and power for mixed discrete-continuous data, which yields consistency when the test is used in constraint-based causal discovery. An extensive evaluation on synthetic and real-world data shows that the proposed CI test outperforms state-of-the-art approaches in the accuracy of both CI testing and causal discovery, particularly in settings with low sample sizes.
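The abstract combines two ingredients: a kNN-based conditional mutual information (CMI) estimate as the test statistic, and a p-value from permutations that are local in the conditioning set Z. The following is a minimal, illustrative Python sketch of that general idea, not the authors' implementation: the CMI estimator follows the common mixed-data recipe of handling distance ties from discrete values by enlarging k, and the permutation scheme is a simplified local shuffle. All function names and parameter defaults here are our own choices.

```python
import numpy as np
from scipy.special import digamma


def _max_dist(a, b):
    """Pairwise Chebyshev (max-norm) distances between rows of a and rows of b."""
    return np.max(np.abs(a[:, None, :] - b[None, :, :]), axis=2)


def cmi_knn(x, y, z, k=5):
    """kNN estimate of conditional mutual information I(X;Y|Z).

    Mixed-data tie handling: when the k-th neighbor distance is 0 (exact ties
    caused by discrete values), all points at distance 0 are counted as
    neighbors, keeping the estimator well defined for mixed samples.
    """
    n = len(x)
    x, y, z = (np.asarray(v, float).reshape(n, -1) for v in (x, y, z))
    joint = np.hstack([x, y, z])
    d_joint = _max_dist(joint, joint)
    np.fill_diagonal(d_joint, np.inf)
    rho = np.sort(d_joint, axis=1)[:, k - 1]           # distance to k-th neighbor
    k_eff = np.where(rho == 0, np.sum(d_joint == 0, axis=1), k)

    def n_within(sub):
        d = _max_dist(sub, sub)
        np.fill_diagonal(d, np.inf)
        return np.sum(d <= rho[:, None], axis=1)

    est = np.mean(digamma(k_eff) - digamma(n_within(np.hstack([x, z])))
                  - digamma(n_within(np.hstack([y, z]))) + digamma(n_within(z)))
    return max(0.0, est)                               # clip small negative estimates


def local_perm_test(x, y, z, k=5, k_perm=5, n_perm=100, seed=0):
    """p-value for H0: X independent of Y given Z via local permutations.

    Each null sample permutes X only among each point's k_perm nearest
    neighbors in Z, so the X-Z dependence is approximately preserved while
    any X-Y dependence given Z is destroyed. This simplified draw allows
    repeated indices, unlike more careful derangement-style schemes.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    x, y, z = (np.asarray(v, float).reshape(n, -1) for v in (x, y, z))
    stat = cmi_knn(x, y, z, k)
    dz = _max_dist(z, z)
    np.fill_diagonal(dz, np.inf)
    nbrs = np.argsort(dz, axis=1)[:, :k_perm]          # Z-neighborhoods to permute in
    null = [cmi_knn(x[[rng.choice(nbrs[i]) for i in range(n)]], y, z, k)
            for _ in range(n_perm)]
    return (1 + sum(s >= stat for s in null)) / (1 + n_perm)
```

As a usage sketch, one would call `local_perm_test(x, y, z)` inside a constraint-based algorithm such as the PC algorithm and compare the returned p-value against a significance level to decide whether to remove an edge.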

Citation (APA)

Huegle, J., Hagedorn, C., & Schlosser, R. (2023). A KNN-Based Non-Parametric Conditional Independence Test for Mixed Data and Application in Causal Discovery. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 14169 LNAI, pp. 541–558). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-43412-9_32
