Abstract
We demonstrate that machine learning (ML) can skillfully classify thunderstorms into three categories: supercell, part of a quasi-linear convective system, or disorganized. These classifications are based on radar data and environmental information obtained through a proximity sounding. We compare the performance of five ML algorithms: logistic regression with the elastic-net penalty, random forests, gradient-boosted forests, and support-vector machines with both linear and nonlinear kernels. The gradient-boosted forest performs best, with an accuracy of 0.77 ± 0.02 and a Peirce score of 0.58 ± 0.04. The linear support-vector machine performs second best, with an accuracy of 0.70 ± 0.02 and a Peirce score of 0.55 ± 0.05. We use two interpretation methods, permutation importance and sequential forward selection, to determine the most important predictors for the ML models. We also use partial-dependence plots to determine how these predictors influence the outcome. A main conclusion is that shape predictors, based on the outline of the storm, appear to be highly important across ML models. The training data, a storm-centered radar scan and modeled proximity sounding, are similar to real-time data. Thus, the models could be used operationally to aid human decision-making by reducing the cognitive load involved in manual storm-mode identification. They could also be run on historical data to perform climatological analyses, which could be valuable to both the research and operational communities.
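For readers unfamiliar with the two scores quoted above, the following is a minimal sketch of how accuracy and the multiclass Peirce score can be computed from a contingency table. It is not the authors' code. The function name `accuracy_and_peirce` and the table layout (rows are predicted classes, columns are observed classes) are assumptions made for illustration, and the formula is the standard multicategory generalization of the Peirce skill score.

```python
def accuracy_and_peirce(table):
    """Return (accuracy, Peirce score) for a square contingency table.

    Assumed layout (hypothetical, not taken from the paper):
    table[i][j] = number of storms predicted as class i and observed as class j.
    """
    n_classes = len(table)
    total = sum(sum(row) for row in table)
    if total == 0:
        raise ValueError("contingency table is empty")

    # Correct classifications lie on the main diagonal.
    hits = sum(table[k][k] for k in range(n_classes))

    # Marginal totals: row sums are predicted counts, column sums are observed counts.
    predicted = [sum(table[k]) for k in range(n_classes)]
    observed = [sum(table[i][k] for i in range(n_classes)) for k in range(n_classes)]

    accuracy = hits / total

    # Fraction correct expected if predictions were independent of observations.
    chance = sum(p * o for p, o in zip(predicted, observed)) / total**2

    # Normalizer that makes a perfect classifier score exactly 1.
    denominator = 1.0 - sum(o * o for o in observed) / total**2
    if denominator == 0:
        raise ValueError("Peirce score is undefined when only one class is observed")

    peirce = (accuracy - chance) / denominator
    return accuracy, peirce
```

Unlike accuracy, the Peirce score is 0 for a classifier that always predicts the same class and 1 for a perfect classifier, which makes it more informative when the three storm modes are not equally frequent.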
Jergensen, G. E., McGovern, A., Lagerquist, R., & Smith, T. (2020). Classifying convective storms using machine learning. Weather and Forecasting, 35(2), 537–559. https://doi.org/10.1175/WAF-D-19-0170.1