Fine particulate matter (PM2.5) pollution affects the Chinese population, particularly in cities such as Shenyang in northeastern China, which hosts a number of traditional heavy industries. This paper proposes a semi-supervised learning model for predicting PM2.5 concentrations. The model draws on real-world data from 11 air quality monitoring stations in Shenyang and nearby cities. The data are of three types: air quality monitoring data, meteorological data, and spatiotemporal information (such as the spatiotemporal effects of PM2.5 emissions and diffusion across different geographical regions). The model consists of two components: genetic programming (GP) to forecast PM2.5 concentrations and support vector classification (SVC) to predict trends. The experimental results show that the proposed model is more accurate than the baseline models, with improvements of 3% to 18% over a classic multivariate linear regression (MLR), 1% to 11% over a multi-layer perceptron neural network (MLP-ANN), and 21% to 68% over a support vector regression (SVR). Furthermore, the proposed GP approach provides an intuitive analysis of how individual factors contribute to PM2.5 concentrations. Data from backtracking points adjacent to other monitoring stations are critical for forecasts over shorter intervals (1 h), whereas wind speed matters more over longer intervals (6 and 24 h).
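As a rough illustration of the two prediction tasks named in the abstract (a concentration forecast and a trend prediction at a given horizon), the sketch below shows one way the targets could be derived from an hourly PM2.5 series. It is not the authors' pipeline: the function name `make_targets`, the three-way trend encoding, and the sample readings are all assumptions made for this example.

```python
from typing import List, Tuple


def make_targets(series: List[float], horizon: int) -> List[Tuple[float, float, int]]:
    """Pair each hourly reading with the reading `horizon` hours later.

    Returns (current, future, trend) tuples. `future` would be the
    regression target for a concentration forecaster, and `trend` the
    label for a trend classifier: 1 = rising, -1 = falling, 0 = unchanged.
    The trend encoding is an assumption of this sketch.
    """
    if horizon <= 0:
        raise ValueError("horizon must be a positive number of hours")
    samples = []
    # An empty list results when the series is not longer than the horizon.
    for t in range(len(series) - horizon):
        current, future = series[t], series[t + horizon]
        trend = (future > current) - (future < current)  # sign of the change
        samples.append((current, future, trend))
    return samples


# Made-up hourly readings, for illustration only.
pm25 = [35.0, 42.0, 40.0, 40.0, 55.0, 61.0, 58.0, 50.0]
for h in (1, 6):
    print(h, make_targets(pm25, h))
```

With a real dataset, the same construction would be repeated for the 1 h, 6 h, and 24 h horizons, with the single current reading replaced by the full feature vector (monitoring, meteorological, and spatiotemporal inputs).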
CITATION STYLE
Jiang, H., Wang, X., & Sun, C. (2022). Predicting PM2.5 in the Northeast China Heavy Industrial Zone: A Semi-Supervised Learning with Spatiotemporal Features. Atmosphere, 13(11). https://doi.org/10.3390/atmos13111744