Abstract
Accurate global estimates of accumulation-mode particle number concentrations (N100) are essential for understanding aerosol-cloud interactions and their climate effects and for improving Earth system models. However, traditional methods that rely on sparse in situ measurements lack comprehensive coverage, and indirect satellite retrievals have limited sensitivity in the relevant size range. To overcome these challenges, we apply two machine learning (ML) techniques, multiple linear regression (MLR) and eXtreme Gradient Boosting (XGB), to generate daily global N100 fields, using in situ measurements as target variables and reanalysis data from the Copernicus Atmosphere Monitoring Service (CAMS) and ERA5 as predictor variables. Cross-validation showed that the ML models captured N100 concentrations well in environments that are well represented in the training set, with over 70 % of daily estimates falling within a factor of 1.5 of observations. However, performance declined in underrepresented regions and conditions, particularly clean and remote environments such as marine, tropical, and polar regions, underscoring the need for more diverse observations. The most important predictors for N100 in the ML models were aerosol-phase sulfate and gas-phase ammonia concentrations, followed by carbon monoxide and sulfur dioxide. Although black carbon and organic matter showed the highest feature importance values, their opposing signs in the MLR model coefficients suggest that their contributions to the N100 estimate largely offset each other. By directly linking estimates to in situ measurements, our ML approach provides valuable insights into the global distribution of N100 and serves as a complementary tool for evaluating Earth system model outputs and advancing the understanding of aerosol processes and their role in the climate system.
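The "within a factor of 1.5" statistic quoted above can be illustrated with a minimal sketch. It assumes the metric is the share of estimate-to-observation ratios lying between 1/1.5 and 1.5; the function name and the sample values are hypothetical and are not taken from the paper or its data.

```python
import numpy as np


def fraction_within_factor(observed, estimated, factor=1.5):
    """Share of estimates within a multiplicative factor of the observations.

    Assumed definition (not confirmed by the source): an estimate counts as
    "within a factor of f" when 1/f <= estimated/observed <= f.
    """
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)

    # Ratios are only meaningful for strictly positive concentrations.
    valid = (observed > 0) & (estimated > 0)
    ratio = estimated[valid] / observed[valid]

    within = (ratio >= 1.0 / factor) & (ratio <= factor)
    return within.mean()


# Hypothetical daily N100 values (cm^-3), for illustration only.
obs = np.array([100.0, 200.0, 400.0, 800.0])
est = np.array([120.0, 150.0, 700.0, 500.0])

frac = fraction_within_factor(obs, est)
print(f"Fraction within a factor of 1.5: {frac:.2f}")
```

A ratio-based criterion of this kind treats over- and underestimation symmetrically on a logarithmic scale, which suits concentrations that span several orders of magnitude.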
Cite
Ovaska, A., Rauth, E., Holmberg, D., Artaxo, P., Backman, J., Bergmans, B., … Paasonen, P. (2025). Global fields of daily accumulation-mode particle number concentrations using in situ observations, reanalysis data, and machine learning. Aerosol Research, 3(2), 589–618. https://doi.org/10.5194/ar-3-589-2025