SAT-based Decision Tree Learning for Large Data Sets

8Citations
Citations of this article
31Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Decision trees of low depth are beneficial for understanding and interpreting the data they represent. Unfortunately, finding a decision tree of lowest complexity (depth or size) that correctly represents given data is NP-hard. Hence known algorithms either (i) utilize heuristics that do not minimize the depth or (ii) are exact but scale only to small or medium-sized instances. We propose a new hybrid approach to decision tree learning, combining heuristic and exact methods in a novel way. More specifically, we employ SAT encodings repeatedly to local parts of a decision tree provided by a standard heuristic, leading to an overall reduction in complexity. This allows us to scale the power of exact SAT-based methods to comparatively very large data sets. We evaluate our new approach experimentally on a range of real-world instances that contain up to several thousand samples. In almost all cases, our method successfully decreases the complexity of the initial decision tree; often, the decrease is significant.

Cite

CITATION STYLE

APA

Schidler, A., & Szeider, S. (2024). SAT-based Decision Tree Learning for Large Data Sets. Journal of Artificial Intelligence Research, 80, 875–918. https://doi.org/10.1613/jair.1.15956

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free