Tree-based iterative input variable selection for hydrological modeling

S. Galelli; A. Castelletti

Journal Article, Open Access

Water Resources Research (2013) 49(7) 4295-4310

DOI: 10.1002/wrcr.20339

108 Citations

99 Readers

Abstract

Input variable selection is an important issue associated with the development of several hydrological applications. Determining the optimal input vector from a large set of candidates to characterize a preselected output might result in a more accurate, parsimonious, and, possibly, physically interpretable model of the natural process. In the hydrological context, the modeled system often exhibits nonlinear dynamics and multiple interrelated variables. Moreover, the set of candidate inputs can be very large and highly redundant, especially when the model reproduces the spatial variability of the physical process. The ideal input selection algorithm should therefore provide modeling flexibility, computational efficiency in dealing with high-dimensional data sets, scalability with respect to input dimensionality, and minimum redundancy. In this paper, we propose the tree-based iterative input variable selection algorithm, a novel hybrid model-based/model-free approach specifically designed to fulfill these four requirements. The algorithm structure provides robustness against redundancy, while the tree-based nature of the underlying model ensures the other key properties. The approach is first tested on a well-known benchmark case study to validate its accuracy and subsequently applied to a real-world streamflow prediction problem in the upper Ticino River Basin (Switzerland). Results indicate that the algorithm is capable of selecting the most significant and nonredundant inputs in different testing conditions, including the real-world large data set characterized by the presence of several redundant variables. This permits one to identify a compact representation of the observational data set, which is key to improving the model performance and assisting with the interpretation of the underlying physical processes. © 2013. American Geophysical Union. All Rights Reserved.
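The abstract does not spell out the steps of the algorithm, so the following is only a minimal sketch of the general idea of iterative, tree-based input selection, not the authors' implementation. It assumes a greedy loop in which each remaining candidate is scored by how much of the current model residual a small regression tree can explain with that candidate alone; the best candidate is kept only if it improves the fit of a tree built on all selected inputs. The single pure-Python tree, the in-sample R² stopping rule, the names `select_inputs`, `explained`, `tol`, and `depth`, and the synthetic data are all illustrative assumptions.

```python
import random

def mean(v):
    return sum(v) / len(v)

def sse(v):
    """Sum of squared deviations from the mean."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v)

def best_split(X, y, features):
    """Return (gain, feature, threshold) of the best variance-reducing split, or None."""
    best, base = None, sse(y)
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [v for row, v in zip(X, y) if row[j] <= t]
            right = [v for row, v in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            gain = base - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, j, t)
    return best

def fit_tree(X, y, features, depth):
    """Fit a small regression tree; a leaf is a float, a node is a tuple."""
    split = best_split(X, y, features) if depth > 0 and len(y) >= 10 else None
    if split is None or split[0] <= 1e-12:
        return mean(y)
    _, j, t = split
    left = [(r, v) for r, v in zip(X, y) if r[j] <= t]
    right = [(r, v) for r, v in zip(X, y) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in left], [v for _, v in left], features, depth - 1),
            fit_tree([r for r, _ in right], [v for _, v in right], features, depth - 1))

def predict(tree, row):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def explained(X, target, features, depth):
    """Fit a tree on `features` and return it with the squared error it removes."""
    tree = fit_tree(X, target, features, depth)
    preds = [predict(tree, row) for row in X]
    return tree, sse(target) - sum((t - p) ** 2 for t, p in zip(target, preds))

def select_inputs(X, y, tol=0.01, depth=3):
    """Greedy selection: rank candidates on residuals, keep one if R^2 improves by tol."""
    selected, residual, best_r2 = [], list(y), 0.0
    while len(selected) < len(X[0]):
        candidates = [j for j in range(len(X[0])) if j not in selected]
        # Score each candidate alone against what the current model leaves unexplained.
        scores = {j: explained(X, residual, [j], depth)[1] for j in candidates}
        best = max(scores, key=scores.get)
        tree, gain = explained(X, y, selected + [best], depth)
        r2 = gain / sse(y)
        if r2 - best_r2 < tol:
            break  # the best remaining candidate adds too little; stop
        selected, best_r2 = selected + [best], r2
        residual = [v - predict(tree, row) for row, v in zip(X, y)]
    return selected

# Synthetic data (assumption): column 1 duplicates column 0, column 3 is pure noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    a, b, c = rng.random(), rng.random(), rng.random()
    X.append([a, a, b, c])
    y.append(4 * a + 2 * b + rng.gauss(0, 0.05))

selected = select_inputs(X, y)
print("selected input columns:", selected)
```

Scoring candidates against residuals rather than against the raw output is what gives a loop of this kind its resistance to redundancy: once one of two duplicated inputs has been selected, its copy has little left to explain and so scores poorly in the next round.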

Citation (APA)

Galelli, S., & Castelletti, A. (2013). Tree-based iterative input variable selection for hydrological modeling. Water Resources Research, 49(7), 4295–4310. https://doi.org/10.1002/wrcr.20339

Readers' Seniority

PhD / Post grad / Masters / Doc: 55 (73%)
Professor / Associate Prof.: 8 (11%)
Researcher: 7
Lecturer / Post doc: 5

Readers' Discipline

Engineering: 36 (51%)
Environmental Science: 18 (26%)
Earth and Planetary Sciences: 9 (13%)
Computer Science: 7 (10%)
