We study the problem of merging decision trees: given k decision trees T1, T2, ..., Tk, we merge them into a single super tree T of (often) much smaller size. The resulting super tree T, which integrates the k decision trees and stores a majority label at each leaf, can also be viewed as a (lossless) compression of a random forest. For any test instance, T is guaranteed to give the same prediction as the random forest consisting of T1, T2, ..., Tk, while avoiding the computational cost of traversing multiple trees. The proposed method is suitable for classification problems with time constraints, for example, online classification tasks where a label must be predicted for a new instance before the next instance arrives. Experiments on five datasets confirm that the super tree T runs significantly faster than the random forest of k trees. The merging procedure also saves the space needed to store the k trees, and it makes the forest model more interpretable, since one tree is naturally easier to interpret than k trees.
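The abstract does not spell out the merging procedure, but the stated guarantee (identical predictions to the forest's majority vote) suggests a region-intersection construction. Below is a minimal sketch under that assumption, for axis-aligned binary splits and unweighted majority voting; the Leaf/Node representation and the helper names (restrict, add_votes, merge_forest) are illustrative choices, not the authors' implementation. Restricting the second tree to the region of each branch prunes splits that become unreachable, which is what can keep the merged tree (often) much smaller than a naive cross-product of leaves.

# Sketch of merging decision trees so the merged tree reproduces the
# forest's majority vote. Each tree is a Leaf holding per-class vote
# counts, or a Node testing x[feature] <= threshold (left if true).
from dataclasses import dataclass
from collections import Counter
from functools import reduce
from typing import Union

@dataclass
class Leaf:
    votes: Counter  # class label -> vote count; a single tree's leaf is Counter({label: 1})

@dataclass
class Node:
    feature: int
    threshold: float
    left: "Tree"
    right: "Tree"

Tree = Union[Leaf, Node]

def restrict(t: Tree, feature: int, threshold: float, go_left: bool) -> Tree:
    """Prune t under the constraint x[feature] <= threshold (go_left=True)
    or x[feature] > threshold (go_left=False)."""
    if isinstance(t, Leaf):
        return t
    if t.feature == feature:
        if go_left and t.threshold >= threshold:
            # x[feature] <= threshold <= t.threshold: the test is always true.
            return restrict(t.left, feature, threshold, go_left)
        if not go_left and t.threshold <= threshold:
            # x[feature] > threshold >= t.threshold: the test is always false.
            return restrict(t.right, feature, threshold, go_left)
    return Node(t.feature, t.threshold,
                restrict(t.left, feature, threshold, go_left),
                restrict(t.right, feature, threshold, go_left))

def add_votes(t: Tree, votes: Counter) -> Tree:
    """Add a fixed vote distribution to every leaf of t."""
    if isinstance(t, Leaf):
        return Leaf(t.votes + votes)
    return Node(t.feature, t.threshold,
                add_votes(t.left, votes), add_votes(t.right, votes))

def merge(a: Tree, b: Tree) -> Tree:
    """Merge two trees; every leaf of the result carries both trees' votes
    for the corresponding intersected decision region."""
    if isinstance(a, Leaf):
        return add_votes(b, a.votes)
    return Node(a.feature, a.threshold,
                merge(a.left, restrict(b, a.feature, a.threshold, True)),
                merge(a.right, restrict(b, a.feature, a.threshold, False)))

def merge_forest(trees: list) -> Tree:
    """Fold k trees into one super tree whose leaf majorities match the forest."""
    return reduce(merge, trees)

def predict(t: Tree, x):
    while isinstance(t, Node):
        t = t.left if x[t.feature] <= t.threshold else t.right
    return t.votes.most_common(1)[0][0]

# Hypothetical usage: two stumps voting on labels "a"/"b".
# t1 = Node(0, 0.5, Leaf(Counter({"a": 1})), Leaf(Counter({"b": 1})))
# t2 = Node(1, 0.0, Leaf(Counter({"a": 1})), Leaf(Counter({"b": 1})))
# predict(merge_forest([t1, t2]), [0.2, -1.0])  # "a", matching the 2-tree vote

Two caveats on this sketch: ties in most_common are broken by insertion order, so tie handling must match whatever rule the forest uses, and without the pruning done in restrict the merged tree's size can grow multiplicatively in the worst case.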
CITATION
Fan, C., & Li, P. (2020). Classification Acceleration via Merging Decision Trees. In FODS 2020 - Proceedings of the 2020 ACM-IMS Foundations of Data Science Conference (pp. 13–22). Association for Computing Machinery, Inc. https://doi.org/10.1145/3412815.3416886