Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador

Abstract

Highlights

What are the main findings?

Machine learning using Google AlphaEarth Foundations satellite embeddings in Google Earth Engine accurately predicted NO₂ and SO₂ concentrations in Quito (R² = 0.71), capturing fine-scale pollution patterns at 10 m resolution.

SHAP analysis revealed that only a small subset of embedding bands drives accurate predictions, demonstrating that compact, globally consistent features can explain urban air quality dynamics without handcrafted indices or auxiliary datasets.

What is the implication of the main finding?

Embedding-based remote sensing models provide a scalable solution for urban air quality monitoring in the Global South, overcoming sparse ground stations and persistent cloud cover.

The approach supports policy-relevant applications such as hotspot detection, trend analysis, and sustainable urban planning, offering transferable methods for data-scarce cities worldwide.

Many Global-South cities lack dense monitoring and suffer persistent cloud cover, hampering fine-scale trend detection. This study evaluates the potential of annual multi-sensor satellite embeddings from the AlphaEarth Foundations model in Google Earth Engine to predict and map major air pollutants in Quito, Ecuador, between 2017 and 2024. The 64-dimensional embeddings integrate Sentinel-1 radar, Sentinel-2 optical imagery, Landsat surface reflectance, ERA5-Land climate variables, GRACE terrestrial water storage, and GEDI canopy structure into a compact representation of surface and climatic conditions.

Annual median concentrations of NO₂, SO₂, PM₂.₅, CO, and O₃ from the Red Metropolitana de Monitoreo Atmosférico de Quito (REEMAQ) were paired with collocated embeddings and modeled using five machine learning algorithms. Support Vector Regression achieved the highest accuracy for NO₂ and SO₂ (R² = 0.71 for both), capturing fine-scale spatial patterns and multi-year changes, including COVID-19 lockdown-related reductions. PM₂.₅ and CO were predicted with moderate accuracy, while O₃ remained challenging due to its short-term photochemical and meteorological drivers and the mismatch with annual aggregation. SHAP analysis revealed that a small subset of embedding bands dominated predictions for NO₂ and SO₂.

The approach provides a scalable and transferable framework for high-resolution urban air quality mapping in data-scarce environments, supporting long-term monitoring, hotspot detection, and evidence-based policy interventions.
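
The data preparation and scoring steps named in the abstract (annual median concentrations per station, pairing with collocated 64-dimensional embeddings, and the R² metric) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the station names, readings, embedding values, and the helper functions `annual_medians`, `pair_with_embeddings`, and `r_squared` are hypothetical and are not the study's data or code. The regression model itself is omitted.

```python
from collections import defaultdict
from statistics import median


def annual_medians(readings):
    """Return the median value per (station, year) from (station, year, value) rows."""
    grouped = defaultdict(list)
    for station, year, value in readings:
        grouped[(station, year)].append(value)
    return {key: median(values) for key, values in grouped.items()}


def pair_with_embeddings(targets, embeddings):
    """Keep only (station, year) keys present in both inputs; return keys, X, y."""
    keys = sorted(key for key in targets if key in embeddings)
    features = [embeddings[key] for key in keys]
    labels = [targets[key] for key in keys]
    return keys, features, labels


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical pollutant readings (station, year, concentration); not real data.
readings = [
    ("station_a", 2019, 30.0),
    ("station_a", 2019, 34.0),
    ("station_a", 2019, 32.0),
    ("station_a", 2020, 20.0),
    ("station_a", 2020, 24.0),
    ("station_b", 2019, 18.0),
    ("station_b", 2019, 22.0),
]

# Hypothetical 64-dimensional annual embeddings sampled at each station location.
embeddings = {
    ("station_a", 2019): [0.1] * 64,
    ("station_a", 2020): [0.2] * 64,
}

targets = annual_medians(readings)
keys, X, y = pair_with_embeddings(targets, embeddings)
```

In this sketch `station_b` is dropped because it has no matching embedding. The resulting `X` and `y` could then be passed to any regressor, and `r_squared` applied to held-out predictions.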

Alvarez, C. I., Ulloa Vaca, C. A., & Echeverria Llumipanta, N. A. (2025). Machine Learning for Urban Air Quality Prediction Using Google AlphaEarth Foundations Satellite Embeddings: A Case Study of Quito, Ecuador. Remote Sensing, 17(20). https://doi.org/10.3390/rs17203472
