Abstract
Hydrological models often involve constitutive laws that may not be optimal in every application. We propose to replace such laws with Kolmogorov‐Arnold networks (KANs), a class of neural networks designed to identify symbolic expressions. We demonstrate KANs' potential on the problem of baseflow identification, a notoriously challenging task plagued by significant uncertainty. KAN‐derived functional dependencies of the baseflow components on the aridity index outperform their original counterparts; they demonstrate that water availability, rather than potential evapotranspiration, drives baseflow by constraining actual evapotranspiration under arid conditions. On a test set, they increase the Nash‐Sutcliffe efficiency (NSE) by 65%, decrease the root mean squared error by 29%, and increase the Kling‐Gupta efficiency by 34%. This superior performance is achieved while reducing the number of fitting parameters from three to two. Next, we use data from 378 catchments across the continental United States to refine the water‐balance equation at the mean‐annual scale. The KAN‐derived equations based on the refined water balance outperform both the current aridity index model, with up to a 105% increase in NSE, and the KAN‐derived equations based on the original water balance. While the performance of our model and tree‐based machine learning methods is similar, KANs offer the advantage of simplicity and transparency and require no specific software or computational tools. This case study focuses on the aridity index formulation, but the approach is flexible and transferable to other hydrological processes.

Plain Language Summary
Equations used in hydrologic models are often suboptimal, resulting in reduced prediction accuracy and efficiency. We implemented Kolmogorov‐Arnold networks (KANs), a machine learning algorithm for deriving symbolic formulations, to estimate groundwater recharge and showed that it outperforms an existing state‐of‐the‐art semi‐empirical formulation. In hydrology, Nash‐Sutcliffe efficiency (NSE), root mean squared error (RMSE), and Kling‐Gupta efficiency (KGE) are commonly used to evaluate model performance. Higher NSE and KGE values indicate better performance, while lower RMSE values are preferable. Our results show that NSE increased by 71%, RMSE decreased by 32%, and KGE improved by 25%. In addition, KAN identifies an optimal functional form and can be used to derive new analytical formulas using prior knowledge. The KAN‐inspired equation outperformed the original formulation and reduced the number of fitting parameters. Furthermore, we refined the water‐balance equation at the mean‐annual scale and showed that, based on the new water‐balance equation, KAN can derive new formulations that are superior to both the original aridity index formulations (up to a 105% increase in NSE) and the KAN‐derived equations based on the original water balance. These findings highlight the significant potential of KAN to advance the scientific understanding of a wide range of hydrologic processes.

Key Points
Kolmogorov‐Arnold networks (KANs) enhance interpretability of machine‐learned hydrological models
KAN‐derived symbolic formulations outperform state‐of‐the‐art semi‐empirical aridity indices
KAN‐identified functional form yields an analytical index with fewer fitting parameters and improved performance
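For readers unfamiliar with the three evaluation metrics named above (NSE, RMSE, and KGE), the following is a minimal Python sketch of their textbook definitions. It is not taken from the paper: the function names are illustrative, and the KGE shown is the commonly used form built from the correlation, the variability ratio, and the bias ratio, which is an assumption because the abstract does not state which variant the authors used.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / variance


def rmse(obs, sim):
    """Root mean squared error: lower is better, 0 is a perfect fit."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed variant): 1 is a perfect fit.

    Combines the linear correlation r, the variability ratio alpha
    (std of simulated / std of observed), and the bias ratio beta
    (mean of simulated / mean of observed). Undefined when either
    series is constant or the observed mean is zero.
    """
    n = len(obs)
    mu_o = sum(obs) / n
    mu_s = sum(sim) / n
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / n
    r = cov / (sd_o * sd_s)
    alpha = sd_s / sd_o
    beta = mu_s / mu_o
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    observed = [0.20, 0.50, 0.40, 0.70]
    simulated = [0.25, 0.45, 0.42, 0.65]
    print("NSE :", nse(observed, simulated))
    print("RMSE:", rmse(observed, simulated))
    print("KGE :", kge(observed, simulated))
```

The percentage changes reported above are relative changes in these scores between the original and the KAN‐derived formulations, so they depend on the baseline value as well as on the improvement.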
Cite
Liu, C., Roy, T., Tartakovsky, D. M., & Dwivedi, D. (2025). Baseflow Identification via Explainable AI With Kolmogorov‐Arnold Networks. Journal of Geophysical Research: Machine Learning and Computation, 2(4). https://doi.org/10.1029/2025jh000749