From RNNs to Transformers: Benchmarking deep learning architectures for hydrologic prediction


Abstract

Recurrent Neural Networks (RNNs) such as Long Short-Term Memory (LSTM) networks have achieved significant success in hydrological modeling. However, the recent breakthroughs of foundation models like ChatGPT and the Segment Anything Model (SAM) in natural language processing and computer vision have spurred interest in the potential of attention-based models for hydrologic prediction. In this study, we propose a deep learning framework that seamlessly integrates multi-source, multi-scale data with multi-model modules, creating an automated platform for multi-dataset benchmarking and attention-based model comparisons beyond LSTM-centered tasks. The proposed framework enables evaluation of deep learning models across diverse hydrologic prediction tasks, including regression (daily runoff, soil moisture, snow water equivalent, and dissolved oxygen prediction), forecasting (using lagged hydrologic observations combined with meteorological inputs), autoregression (forecasting based solely on historical observations), spatial cross-validation (assessing model generalization to ungauged regions), and zero-shot forecasting (prediction without task-specific training data). Specifically, we benchmarked 11 Transformer-based architectures against a baseline LSTM model and further evaluated pretrained Large Language Models (LLMs) and Time Series Attention Models (TSAMs) regarding their capabilities for zero-shot hydrologic forecasting. Results show that LSTM models perform best in regression tasks, especially on the global streamflow dataset (median KGE = 0.75), surpassing the best-performing Transformer-based model's KGE by 0.11. However, as tasks become more complex (from regression and forecasting to autoregression and zero-shot prediction), attention-based models gradually surpass LSTM models.
This study provides a robust framework for comparing and developing different model structures in the era of large-scale models and offers a valuable benchmark for water resource modeling, forecasting, and management.
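The reported skill scores use the Kling-Gupta Efficiency (KGE), a standard hydrologic metric that decomposes model error into correlation, variability bias, and mean bias. For readers unfamiliar with it, the following is a minimal reference implementation of the standard KGE formula (Gupta et al., 2009), not code from the paper itself:

```python
import numpy as np

def kge(sim, obs):
    """Kling-Gupta Efficiency:
    KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
    where r is the linear correlation between simulated and observed
    values, alpha the ratio of their standard deviations, and beta
    the ratio of their means. KGE = 1 indicates a perfect match.
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]        # correlation component
    alpha = sim.std() / obs.std()          # variability ratio
    beta = sim.mean() / obs.mean()         # bias ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A median KGE of 0.75 across a global streamflow dataset, as reported for the LSTM, therefore reflects jointly high correlation and low bias relative to observations at the majority of gauges.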

Citation (APA)

Liu, J., Shen, C., O’Donncha, F., Song, Y., Zhi, W., Beck, H. E., … Lawson, K. (2025). From RNNs to Transformers: Benchmarking deep learning architectures for hydrologic prediction. Hydrology and Earth System Sciences, 29(23), 6811–6828. https://doi.org/10.5194/hess-29-6811-2025
