Pollutant forecasting is an important problem in the environmental sciences. Data mining is an approach to discovering knowledge from large data sets. This paper applies data mining methods to forecast the concentration level of fine particulate matter (PM2.5), an important air pollutant. Several tree-based classification algorithms are available in data mining, such as CART, C4.5, Random Forest (RF) and C5.0. RF and C5.0 are popular ensemble methods: RF builds on CART with bagging, and C5.0 builds on C4.5 with boosting. This paper builds PM2.5 concentration level predictive models based on RF and C5.0 using R packages. The data set covers the period 2000-2011 in a new town of Hong Kong. The concentration is divided into 2 levels, with the critical point at 25 μg/m³ (24-hour mean). Under 10-fold cross validation repeated 100 times, the best testing accuracy comes from the RF model, at around 0.845 to 0.854.
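The evaluation protocol described in the abstract (two concentration levels split at 25 μg/m³, scored by repeated 10-fold cross validation) can be sketched as follows. This is only an illustrative outline in Python, not the paper's R code: the function names are hypothetical, the choice of a strict "greater than" at the threshold is an assumption, and a majority-class baseline stands in for the RF and C5.0 models.

```python
import random
from collections import Counter

THRESHOLD = 25.0  # critical point from the abstract (24-hour mean)


def to_level(concentration):
    """Map a 24-hour mean concentration to a binary level.

    Assumption: values strictly above the threshold form the high level (1).
    """
    return 1 if concentration > THRESHOLD else 0


def repeated_kfold(n_samples, k=10, repeats=100, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
        for i in range(k):
            train = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield train, folds[i]


def fit_majority(X_train, y_train):
    """Placeholder model (not RF or C5.0): remember the most common label."""
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    """Predict the stored majority label regardless of the input."""
    return model


def cv_accuracy(X, y, fit, predict, k=10, repeats=100, seed=0):
    """Mean test accuracy over every fold of every repeat."""
    scores = []
    for train, test in repeated_kfold(len(y), k, repeats, seed):
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)
```

To evaluate a real classifier, one would pass its own `fit` and `predict` callables to `cv_accuracy` in place of the majority-class placeholder; the splitting and averaging logic stays the same.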
Zhao, Y., & Abu, Y. (2013). Fine Particulate Matter Concentration Level Prediction by using Tree-based Ensemble Classification Algorithms. International Journal of Advanced Computer Science and Applications, 4(5). https://doi.org/10.14569/ijacsa.2013.040503