Abstract
Random forest is a flexible algorithm with a wide range of applications, and it performs well on many data sets. It relies on few statistical assumptions, requires little preprocessing, and can handle large data sets with high dimensionality and missing values. Nevertheless, random forest struggles with high-cardinality categorical variables, unbalanced data, time series forecasting, and variable interpretation, and it is sensitive to hyperparameters. Random forest is therefore well suited to high-dimensional data, data with missing values, and large amounts of previously unprocessed data, as well as settings without prior statistical assumptions. It is less suitable when the data contain endogenous temporal effects or high-cardinality categorical variables, or when interpretation is the primary goal. Despite these shortcomings, improvements remain possible. It would be more convenient for users to screen methods if a rating system gave each candidate algorithm an overall score based on the input data and the users' goals.
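As a concrete illustration of the strengths described above, the following is a minimal sketch of fitting a random forest to a high-dimensional data set. It uses scikit-learn's RandomForestClassifier, which the paper does not prescribe; the data set, parameter values, and feature counts are illustrative assumptions, not the paper's experiment.

```python
# Sketch: random forest on high-dimensional synthetic data (assumed setup).
# No scaling or distributional assumptions are applied before fitting,
# reflecting the low preprocessing burden noted in the abstract.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# High-dimensional data: 500 samples, 100 features, only 10 informative.
X, y = make_classification(n_samples=500, n_features=100,
                           n_informative=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# n_estimators and other hyperparameters matter in practice; the abstract
# notes that random forest is sensitive to such choices.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
acc = clf.score(X_test, y_test)

# Impurity-based importances give a rough variable ranking, though
# interpretation remains one of the method's noted weaknesses.
top = clf.feature_importances_.argsort()[::-1][:5]
print(f"accuracy={acc:.2f}, top features={top.tolist()}")
```

A grid search over `n_estimators`, `max_depth`, and `max_features` would typically follow, given the method's hyperparameter sensitivity.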
Citation
Zhu, T. (2020). Analysis on the applicability of the random forest. In Journal of Physics: Conference Series (Vol. 1607). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1607/1/012123