Background: Air pollution has a direct impact on human health. The ability to predict air pollutant concentrations accurately is therefore vital for raising public awareness and for supporting air quality management. Aim: This research aims to predict PM10 concentrations on Langkawi Island, Malaysia, using random forest (RF) and multiple linear regression (MLR). Method: The predictive analysis was based on hourly air pollution data from 2003 to 2017. Eight parameters were used: PM10, NO2, O3, CO, SO2, relative humidity (RH), temperature (T), and wind speed (WS). The hourly trends of PM10, SO2, NO2, CO, and O3 at Langkawi Island were below the limits recommended by the Malaysian Ambient Air Quality Guidelines (MAAQG). MLR and RF models were fitted and compared on prediction accuracy. Result: For MLR, the values of RMSE, NAE, IA, PA and R2 were 8.0698, 0.1368, 0.8584, 0.7737 and 0.5984, respectively; for RF, they were 6.674038, 0.107664, 0.911974, 0.852570 and 0.726681, respectively. RF was therefore chosen as the better model: its error measures (RMSE and NAE) are lower, and its accuracy measures (IA, PA and R2) are closer to 1. Nevertheless, both PM10 models (RF and MLR) are unable to capture the higher observed concentrations.
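The comparison in the abstract rests on error measures (lower is better) and accuracy measures (closer to 1 is better). The abstract does not state the formulas used, so the sketch below is only an illustration under assumed, commonly used definitions: RMSE with a 1/N mean, NAE as total absolute error over total observed concentration, IA as Willmott's index of agreement, and R2 as the squared Pearson correlation. PA is omitted because its definition varies between papers. The function names and the toy arrays are hypothetical and are not taken from the study.

```python
import numpy as np


def _as_arrays(observed, predicted):
    """Convert inputs to float arrays of equal length."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    if obs.shape != pred.shape:
        raise ValueError("observed and predicted must have the same shape")
    return obs, pred


def rmse(observed, predicted):
    """Root mean square error (assumed 1/N form); 0 is a perfect fit."""
    obs, pred = _as_arrays(observed, predicted)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def nae(observed, predicted):
    """Normalised absolute error: sum|P - O| / sum(O); 0 is a perfect fit."""
    obs, pred = _as_arrays(observed, predicted)
    return float(np.sum(np.abs(pred - obs)) / np.sum(obs))


def index_of_agreement(observed, predicted):
    """Willmott's index of agreement; 1 is a perfect fit."""
    obs, pred = _as_arrays(observed, predicted)
    obs_mean = obs.mean()
    denom = np.sum((np.abs(pred - obs_mean) + np.abs(obs - obs_mean)) ** 2)
    return float(1.0 - np.sum((pred - obs) ** 2) / denom)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    obs, pred = _as_arrays(observed, predicted)
    r = np.corrcoef(obs, pred)[0, 1]
    return float(r ** 2)


def evaluate(observed, predicted):
    """Collect all four measures in one dictionary."""
    return {
        "RMSE": rmse(observed, predicted),
        "NAE": nae(observed, predicted),
        "IA": index_of_agreement(observed, predicted),
        "R2": r_squared(observed, predicted),
    }


# Toy values for illustration only (not data from the study).
toy_observed = [10.0, 20.0, 30.0, 40.0]
toy_predicted = [12.0, 18.0, 33.0, 37.0]
print(evaluate(toy_observed, toy_predicted))
```

Applying `evaluate` to the predictions of two models on the same held-out observations would give a side-by-side comparison of the kind reported in the abstract, with the caveat that the exact values depend on the formula variants chosen.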
Ahmad, N., Ul-Saufie, A. Z., Shaziayani, W. N., Abidin, A. W. Z., Zulazmi, N. E. S., & Harb, S. M. (2022). Evaluating the Performance of Random Forest and Multiple Linear Regression for Higher Observed PM10 Concentrations. Israa University Journal of Applied Science, 6(1), 72–90. https://doi.org/10.52865/WHPM9019