In data analysis, induction of decision trees serves two main goals: first, induced decision trees can be used for classification/prediction of new instances, and second, they represent an easy-to-interpret model of the problem domain that can be used for explanation. The accuracy of the induced classifier is usually estimated using N-fold cross validation, whereas for explanation purposes a decision tree induced from all the available data is used. Decision tree learning is relatively non-robust: a small change in the training set may significantly change the structure of the induced decision tree. This paper presents a decision tree construction method in which the domain model is constructed by consensus clustering of N decision trees induced in N-fold cross-validation. Experimental results show that consensus decision trees are simpler than C4.5 decision trees, indicating that they may be a more stable approximation of the intended domain model than decision tree, constructed from the entire set of training instances.
CITATION STYLE
Kavšek, B., Lavrač, N., & Ferligoj, A. (2001). Consensus decision trees: Using consensus hierarchical clustering for data relabelling and reduction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2167, pp. 251–262). Springer Verlag. https://doi.org/10.1007/3-540-44795-4_22
Mendeley helps you to discover research relevant for your work.