A Bayes classifier-based OVFDT algorithm for massive stream data mining on big data platform


Abstract

Recently, online incremental data mining has become a rapidly growing area of research in stream data mining. The VFDT algorithm, an excellent incremental decision tree classification algorithm, is widely used in online data mining. To optimize VFDT, a dynamic tie-breaking threshold strategy and a pre-pruning mechanism are used to reduce the scale of the decision tree. Furthermore, a Bayes classifier is applied to the leaf nodes of the Hoeffding decision tree, which improves classification accuracy. In this paper, the improved algorithm is called OVFDT (Optimized VFDT). To improve the performance of OVFDT on massive streaming data, an implementation scheme for OVFDT on the MapReduce platform is proposed. Considering the need for real-time computing, an implementation scheme on the Storm platform is also designed. Three comparison experiments compare the scale, the classification accuracy, and the execution time of the decision trees generated by the three algorithms. The simulation results show that, compared with the C4.5 and VFDT algorithms, OVFDT effectively reduces the scale of the decision tree and also improves classification accuracy.
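
For readers unfamiliar with the tie-breaking threshold mentioned in the abstract, the following is a minimal sketch of the standard VFDT split test based on the Hoeffding bound. It is not the paper's OVFDT implementation: the abstract does not specify how the dynamic threshold is computed, so a fixed threshold is used here, and the function names, the default `delta`, and the default `tie_threshold` are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the true mean of a variable
    with the given range lies within this distance of the mean of n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, n, n_classes=2,
                 delta=1e-7, tie_threshold=0.05):
    """Standard VFDT split test at a leaf that has seen n examples.

    best_gain and second_gain are the information gains of the two best
    candidate attributes. tie_threshold is a FIXED value here (assumption);
    the paper describes a dynamic strategy whose details are not in the abstract.
    """
    value_range = math.log2(n_classes)  # range of information gain
    eps = hoeffding_bound(value_range, delta, n)
    clear_winner = (best_gain - second_gain) > eps
    tie = eps < tie_threshold  # candidates too close to separate: split anyway
    return clear_winner or tie


if __name__ == "__main__":
    print(should_split(0.30, 0.10, n=1000))  # large gap between the two gains
    print(should_split(0.30, 0.28, n=1000))  # small gap, bound still wide
    print(should_split(0.30, 0.28, n=5000))  # small gap, bound below threshold
```

A larger tie threshold makes the tree split sooner on near-equal attributes and therefore grow larger, which is presumably why the paper adjusts it dynamically to control tree scale.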


Li, L., Li, P., Xu, H., & Chen, F. (2018). A bayes classifier-based OVFDT algorithm for massive stream data mining on big data platform. In Advances in Intelligent Systems and Computing (Vol. 611, pp. 537–546). Springer Verlag. https://doi.org/10.1007/978-3-319-61566-0_49
