Consensus decision trees: Using consensus hierarchical clustering for data relabelling and reduction

Abstract

In data analysis, the induction of decision trees serves two main goals: first, induced decision trees can be used for the classification/prediction of new instances; second, they represent an easy-to-interpret model of the problem domain that can be used for explanation. The accuracy of the induced classifier is usually estimated using N-fold cross-validation, whereas for explanation purposes a decision tree induced from all the available data is used. Decision tree learning is relatively non-robust: a small change in the training set may significantly change the structure of the induced decision tree. This paper presents a decision tree construction method in which the domain model is constructed by consensus clustering of the N decision trees induced in N-fold cross-validation. Experimental results show that consensus decision trees are simpler than C4.5 decision trees, indicating that they may be a more stable approximation of the intended domain model than a decision tree constructed from the entire set of training instances.
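The abstract does not spell out the consensus step in detail, so the sketch below is only an illustrative approximation of the idea, not the authors' algorithm. It assumes a same-leaf co-occurrence matrix as the consensus measure across the N cross-validation trees, average-linkage hierarchical clustering over that matrix, and majority-class relabelling per cluster before inducing a single final tree. Scikit-learn's CART trees stand in for C4.5, and the dataset, fold count, and cluster count are arbitrary choices.

```python
# Hedged sketch of consensus decision trees: co-occurrence matrix,
# hierarchical clustering, and relabelling are illustrative assumptions,
# not the method from the paper.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

X, y = load_iris(return_X_y=True)
n = len(X)
N = 10  # number of cross-validation folds / induced trees

# 1. Induce N trees in N-fold cross-validation and count, for every pair of
#    instances, how often they fall into the same leaf.
co = np.zeros((n, n))
for train_idx, _ in KFold(n_splits=N, shuffle=True, random_state=0).split(X):
    tree = DecisionTreeClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    leaves = tree.apply(X)                      # leaf id of every instance
    co += (leaves[:, None] == leaves[None, :])  # same-leaf co-occurrence

# 2. Consensus hierarchical clustering: treat disagreement as a distance.
dist = 1.0 - co / N
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
n_clusters = len(np.unique(y))  # assumed cut: one cluster per class
clusters = fcluster(Z, t=n_clusters, criterion="maxclust")

# 3. Relabel each instance with the majority class of its consensus cluster,
#    then induce one final "consensus" tree from the relabelled data.
y_relabelled = y.copy()
for c in np.unique(clusters):
    members = clusters == c
    y_relabelled[members] = np.bincount(y[members]).argmax()

consensus_tree = DecisionTreeClassifier(random_state=0).fit(X, y_relabelled)
print("leaves in consensus tree:", consensus_tree.get_n_leaves())
```

Because the relabelling smooths away instances that the N trees disagree on, the final tree tends to have fewer leaves than one grown on the raw labels, which is consistent with the abstract's claim that consensus decision trees are simpler than C4.5 trees.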

Citation (APA)

Kavšek, B., Lavrač, N., & Ferligoj, A. (2001). Consensus decision trees: Using consensus hierarchical clustering for data relabelling and reduction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2167, pp. 251–262). Springer Verlag. https://doi.org/10.1007/3-540-44795-4_22
