Abstract
Accurate real-time forecasting of air pollutant levels and identification of their influencing factors are critical for effective pollution control and management. In this study, we implemented three machine learning algorithms, namely Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and a Fully Connected Neural Network (FCNN), to predict PM2.5 and O3 concentrations in the Beijing–Tianjin–Hebei region from 2019 to 2023. XGBoost outperformed the other two algorithms and was therefore used to predict PM2.5 and O3 concentrations and to identify their controlling factors. The models captured the spatial and temporal variations of the pollutants in the study area, and both anthropogenic sources and weather conditions were found to have significant impacts on air pollutant levels. PM10 and CO were significantly correlated with PM2.5 levels, which could be attributed to their similar emission sources and dispersion characteristics in air. O3 concentrations were strongly influenced by temperature and NO2, both of which have significant impacts on O3 generation. This study demonstrates that XGBoost-based models are cost-effective tools for predicting PM2.5 and O3 levels and identifying their controlling factors. These findings provide valuable insights for formulating effective air pollution prevention policies.
Cite
Wei, C., Zhao, C., Hu, Y., & Tian, Y. (2025). Predicting the Concentration Levels of PM2.5 and O3 for Highly Urbanized Areas Based on Machine Learning Models. Sustainability (Switzerland), 17(20). https://doi.org/10.3390/su17209211