Can we improve machine-learning (ML) emulators with synthetic data? If data are scarce or expensive to source and a physical model is available, statistically generated data may be useful for augmenting training sets cheaply. Here we explore the use of copula-based models for generating synthetically augmented datasets in weather and climate by testing the method on a toy physical model of downwelling longwave radiation and corresponding neural network emulator. Results show that for copula-augmented datasets, predictions are improved by up to 62 % for the mean absolute error (from 1.17 to 0.44 W m⁻²).
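To make the idea of copula-based augmentation concrete, the following is a minimal sketch that uses only the Python standard library. It assumes a two-variable dataset and a Gaussian copula with empirical marginals, chosen here purely for illustration; it is not necessarily the copula family or the implementation used by the authors, and all function names and the toy data are hypothetical.

```python
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for score transforms


def normal_scores(values):
    """Map a sample to standard-normal scores via its ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) keeps the pseudo-observation strictly inside (0, 1)
        scores[i] = _STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation coefficient of two equally long samples."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def empirical_quantile(sorted_values, u):
    """Inverse empirical CDF, interpolating linearly between order statistics."""
    n = len(sorted_values)
    pos = u * (n - 1)
    lo = int(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def sample_gaussian_copula(x, y, n_samples, seed=0):
    """Draw synthetic (x, y) pairs that mimic the dependence between x and y."""
    rng = random.Random(seed)
    # Fit: the copula parameter is the correlation of the normal scores.
    rho = pearson(normal_scores(x), normal_scores(y))
    resid_scale = max(0.0, 1.0 - rho ** 2) ** 0.5  # guard against rounding
    xs, ys = sorted(x), sorted(y)
    samples = []
    for _ in range(n_samples):
        # Sample correlated normals, map to uniforms, then to the marginals.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + resid_scale * rng.gauss(0.0, 1.0)
        u1, u2 = _STD_NORMAL.cdf(z1), _STD_NORMAL.cdf(z2)
        samples.append((empirical_quantile(xs, u1), empirical_quantile(ys, u2)))
    return samples


# Toy "observed" data (hypothetical): two correlated variables.
_rng = random.Random(42)
x = [_rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.8 * xi + 0.6 * _rng.gauss(0.0, 1.0) for xi in x]

# Synthetic inputs that could be passed through the physical model to
# produce additional training pairs for an emulator.
synthetic = sample_gaussian_copula(x, y, n_samples=500)
```

In an augmentation workflow of this kind, the synthetic inputs would be run through the physical model to obtain matching outputs, and the resulting pairs would be appended to the original training set. Because this sketch uses empirical marginals, the synthetic values stay within the range of the observed data.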
Meyer, D., Nagler, T., & Hogan, R. J. (2021). Copula-based synthetic data augmentation for machine-learning emulators. Geoscientific Model Development, 14(8), 5205–5215. https://doi.org/10.5194/gmd-14-5205-2021