Abstract
As lake monitoring data accumulate, data-driven machine learning (ML) models may capture complex algal bloom dynamics that process-based (PB) models cannot fully describe. We applied two ML models, the gradient boost regressor (GBR) and the long short-term memory (LSTM) network, to predict algal blooms and seasonal changes in algal chlorophyll concentrations (Chl) in a mesotrophic lake. Three predictive workflows were tested. The first was based solely on available measurements. The second applied a two-step approach, first estimating lake nutrients that have limited observations and then predicting Chl from observed and pre-generated environmental factors. The third, a hybrid workflow, added hydrodynamic data derived from a PB model as training features in the two-step ML approach. The ML models outperformed a PB model in predicting nutrients and Chl, and the hybrid model further improved the prediction of the timing and magnitude of algal blooms. A data sparsity test based on shuffling the order of training and testing years showed that the accuracy of the ML models decreased with increasing sample interval and that model performance varied with training-testing year combinations.
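The two-step approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. A small k-nearest-neighbour regressor written in plain Python stands in for the GBR and LSTM models so that the example needs no third-party libraries. The driver variables, the 10-day nutrient sampling interval, the synthetic relationships, and all function names (`knn_predict`, `two_step_predict`) are assumptions made purely for illustration.

```python
import math
import random


def knn_predict(train_X, train_y, query, k=3):
    """Stand-in regressor (NOT GBR/LSTM): mean target of the k nearest rows."""
    ranked = sorted(zip(train_X, train_y), key=lambda row: math.dist(row[0], query))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def two_step_predict(drivers, nutrient_obs, chl_obs, train_idx, test_idx, k=3):
    # Step 1: learn the sparsely observed nutrient from the daily drivers,
    # using only training days on which the nutrient was actually sampled,
    # then generate a nutrient estimate for every day.
    sampled = [i for i in train_idx if nutrient_obs[i] is not None]
    X1 = [drivers[i] for i in sampled]
    y1 = [nutrient_obs[i] for i in sampled]
    nutrient_est = [knn_predict(X1, y1, d, k) for d in drivers]

    # Step 2: predict Chl from the observed drivers plus the pre-generated
    # nutrient estimate from step 1.
    def features(i):
        return tuple(drivers[i]) + (nutrient_est[i],)

    X2 = [features(i) for i in train_idx]
    y2 = [chl_obs[i] for i in train_idx]
    chl_pred = [knn_predict(X2, y2, features(i), k) for i in test_idx]
    return nutrient_est, chl_pred


# Synthetic data (hypothetical): two daily drivers, a nutrient sampled every
# 10th day, and a Chl series that depends on temperature and the nutrient.
rng = random.Random(0)
n_days = 120
drivers = [
    (10 + 10 * math.sin(2 * math.pi * t / n_days), rng.uniform(0, 1))
    for t in range(n_days)
]
nutrient_true = [20 + 0.5 * temp + 5 * flow for temp, flow in drivers]
nutrient_obs = [v if t % 10 == 0 else None for t, v in enumerate(nutrient_true)]
chl_obs = [
    0.2 * temp + 0.1 * nut + rng.gauss(0, 0.2)
    for (temp, _), nut in zip(drivers, nutrient_true)
]

# Train on the earlier period and test on the later one.
train_idx = list(range(0, 90))
test_idx = list(range(90, n_days))

nutrient_est, chl_pred = two_step_predict(
    drivers, nutrient_obs, chl_obs, train_idx, test_idx
)
rmse = math.sqrt(
    sum((p - chl_obs[i]) ** 2 for p, i in zip(chl_pred, test_idx)) / len(test_idx)
)
print(f"Chl RMSE on the test period: {rmse:.3f}")
```

In this sketch, a hybrid variant in the spirit of the third workflow would append PB-model-derived hydrodynamic variables to each `drivers` tuple before both steps. Which variables those would be is not specified in the abstract.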
Cite
Lin, S., Pierson, D. C., & Mesman, J. P. (2023). Prediction of algal blooms via data-driven machine learning models: an evaluation using data from a well-monitored mesotrophic lake. Geoscientific Model Development, 16(1), 35–46. https://doi.org/10.5194/gmd-16-35-2023