GEMS-GER: A machine learning benchmark dataset of long-term groundwater levels in Germany with meteorological forcings and site-specific environmental features

Abstract

We present GEMS-GER (Groundwater Levels, Environment, Meteorology, Site Properties), the first benchmark dataset specifically designed for machine learning applications in long-term groundwater level modeling in Germany. The dataset comprises 32 years of gapless weekly observations from 3207 monitoring wells, enriched with meteorological forcing variables and more than 50 site-specific static attributes. All data have undergone extensive preprocessing, including harmonization, outlier removal, and iterative imputation, to ensure high quality and suitability for machine learning applications. The wells are spatially distributed across Germany and cover diverse hydrogeological settings and aquifer types. To demonstrate the utility of the dataset, we provide three initial benchmark models: a single-well CNN model, a global LSTM model using dynamic inputs, and a global LSTM model incorporating both dynamic and static features. The best-performing model achieves satisfactory predictive performance (NSE > 0.5) for more than half (52 %) of the wells, which is generally considered a solid result in the context of groundwater-level modeling. GEMS-GER is openly available under an open-access license via Zenodo, accompanied by detailed documentation (Ohmer et al., 2025; 10.5281/zenodo.15530171). By enabling standardized and reproducible evaluation of data-driven groundwater models, the dataset offers a robust foundation for advancing machine learning research in hydrogeology.
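The NSE > 0.5 criterion quoted in the abstract can be illustrated with a short sketch. It uses the standard Nash-Sutcliffe efficiency definition (one minus the ratio of squared model error to the squared deviation of the observations from their mean) and then computes the share of wells above a threshold. The function names, the well identifiers, and the toy series are hypothetical and are not taken from the GEMS-GER dataset or its benchmark code.

```python
from statistics import fmean


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations from the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = fmean(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / sst


def share_above(scores, threshold=0.5):
    """Fraction of scores strictly greater than the threshold."""
    scores = list(scores)
    return sum(score > threshold for score in scores) / len(scores)


# Hypothetical toy data: (observed, simulated) groundwater levels per well.
wells = {
    "well_a": ([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8]),  # close fit
    "well_b": ([1.0, 2.0, 3.0, 4.0], [2.5, 2.5, 2.5, 2.5]),  # predicts the mean
}

scores = {name: nse(obs, sim) for name, (obs, sim) in wells.items()}
fraction = share_above(scores.values())
```

A model that always predicts the observed mean scores NSE = 0, and a perfect fit scores NSE = 1, so the 0.5 threshold sits halfway between those two reference points.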

Cite

Ohmer, M., Liesch, T., Habbel, B., Heudorfer, B., Gomez, M., Clos, P., … Broda, S. (2026). GEMS-GER: A machine learning benchmark dataset of long-term groundwater levels in Germany with meteorological forcings and site-specific environmental features. Earth System Science Data, 18(1), 77–95. https://doi.org/10.5194/essd-18-77-2026
