Performance Evaluation of Tree Ensemble Classification Models Towards Challenges of Big Data Analytics

Abstract

Big Data Analytics poses challenges such as effective and accurate real-time data mining, a lack of suitable tools and techniques, and the limits of in-memory processing. Tree-based ensemble methods, a class of machine learning models, can perform this kind of large-scale analytical processing when combined with high-performance cluster computing (a form of distributed computing) and parallel processing. The Random Forest algorithm (a forest of randomized trees, that is, a tree ensemble) is considered for the performance evaluation because its trees can be grown concurrently and independently, which makes it a natural fit for parallelization. It also offers good accuracy, handles noisy and imbalanced datasets, and, unlike a single tree model, resists overfitting on large datasets. Although notable improvements over the original approach are available, limitations remain regarding performance and streaming data: the performance rate decreases as compute nodes are added, due to redundant allocation of feature subsets in the hybrid approach of task and data parallelization, and the algorithm cannot handle stream data. These performance issues are identified, and a problem statement is formulated with the objective of achieving linearly scalable speedup and incremental processing capability for the random forest algorithm, so that it can perform predictive analytics over massive datasets in a cluster environment.
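The idea that the trees of a random forest can be grown independently, and therefore in parallel, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-split trees (decision stumps) instead of full trees, a single machine instead of a cluster, and a thread pool as a stand-in for compute nodes (CPython threads show the task structure but give no real CPU speedup). The names train_stump, train_forest and predict, and the toy dataset, are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def train_stump(args):
    """Fit one decision stump on a bootstrap sample and a random feature subset."""
    X, y, n_sub, seed = args
    rng = random.Random(seed)
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]    # bootstrap sample (with replacement)
    feats = rng.sample(range(len(X[0])), n_sub)   # random feature subset
    best = None
    for f in feats:
        for i in idx:
            t = X[i][f]                           # candidate threshold
            left = [y[j] for j in idx if X[j][f] <= t]
            right = [y[j] for j in idx if X[j][f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # Degenerate sample with no valid split: always predict the majority label.
        maj = Counter(y[j] for j in idx).most_common(1)[0][0]
        return (0, float("inf"), maj, maj)
    return best[1:]                               # (feature, threshold, left label, right label)


def train_forest(X, y, n_trees=25, n_sub=2, workers=4, seed=0):
    """Grow all stumps as independent tasks; each task has its own seed."""
    tasks = [(X, y, n_sub, seed + k) for k in range(n_trees)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_stump, tasks))


def predict(forest, row):
    """Majority vote over all stumps in the forest."""
    votes = Counter(l if row[f] <= t else r for f, t, l, r in forest)
    return votes.most_common(1)[0][0]


# Toy dataset (hypothetical): two well-separated classes, three features.
X = [[i, 2 * i, 3 * i] for i in range(20)] + \
    [[100 + i, 200 + 2 * i, 300 + 3 * i] for i in range(20)]
y = [0] * 20 + [1] * 20

forest = train_forest(X, y)
low_label = predict(forest, [0, 0, 0])
high_label = predict(forest, [500, 500, 500])
```

Because each task depends only on its own seed and a read-only view of the data, no coordination between trees is needed during training; only the final vote combines their results. In a cluster setting the same task list would be distributed across nodes rather than threads.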

Cite

Godara, H., Govil, M. C., & Pilli, E. S. (2019). Performance Evaluation of Tree Ensemble Classification Models Towards Challenges of Big Data Analytics. In Communications in Computer and Information Science (Vol. 985, pp. 141–154). Springer Verlag. https://doi.org/10.1007/978-981-13-8300-7_12