Exploring Data Augmentation Algorithm to Improve Genomic Prediction of Top-Ranking Cultivars

Abstract

Genomic selection (GS) is a groundbreaking statistical machine learning method for advancing plant and animal breeding. Nonetheless, its practical implementation remains challenging due to numerous factors affecting its predictive performance. This research explores the potential of data augmentation to enhance prediction accuracy across entire datasets and specifically within the top 20% of the testing set. Our findings indicate that, overall, the data augmentation method (method A), when compared to the conventional model (method C) and assessed using Mean Arctangent Absolute Prediction Error (MAAPE) and normalized root mean square error (NRMSE), did not improve the prediction accuracy for the unobserved cultivars. However, significant improvements in prediction accuracy (evidenced by reduced prediction error) were observed when data augmentation was applied exclusively to the top 20% of the testing set. Specifically, reductions in MAAPE_20 and NRMSE_20 by 52.86% and 41.05%, respectively, were noted across various datasets. Further investigation is needed to refine data augmentation techniques for effective use in genomic prediction.
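
The two error metrics named in the abstract, and their restriction to the top 20% of the testing set, can be illustrated with a minimal sketch. It is not the authors' code and rests on assumptions the abstract does not state: MAAPE is taken in its commonly used form, the mean of arctan(|(y - ŷ) / y|); NRMSE is taken as RMSE divided by the mean of the observed values (other normalizations exist); and the "top 20%" is selected by ranking the observed values in descending order. The function names `maape`, `nrmse`, and `top_fraction` are hypothetical.

```python
import numpy as np


def maape(y_true, y_pred):
    """Mean of arctan(|(y - y_hat) / y|); assumes no observed value is zero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.arctan(np.abs((y_true - y_pred) / y_true))))


def nrmse(y_true, y_pred):
    """RMSE divided by the mean observed value (assumed normalization)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return float(rmse / np.mean(y_true))


def top_fraction(y_true, y_pred, fraction=0.2):
    """Return the (observed, predicted) pairs with the largest observed values.

    Assumption: the top-ranking lines are chosen by observed value,
    with at least one line always kept.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    k = max(1, int(np.ceil(fraction * y_true.size)))
    idx = np.argsort(y_true)[::-1][:k]  # indices of the k largest observations
    return y_true[idx], y_pred[idx]


# Toy values for illustration only; these are not data from the study.
observed = [10.0, 8.0, 6.0, 4.0, 2.0]
predicted = [9.0, 8.0, 7.0, 4.0, 1.0]

obs_top, pred_top = top_fraction(observed, predicted, fraction=0.2)
maape_20 = maape(obs_top, pred_top)
nrmse_20 = nrmse(obs_top, pred_top)
```

Computing `maape` and `nrmse` once on the full testing set and once on the output of `top_fraction` gives the whole-set and top-20% variants (the latter corresponding to what the abstract calls MAAPE_20 and NRMSE_20) under the assumptions above.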

Cite

Montesinos-López, O. A., Sivakumar, A., Huerta Prado, G. I., Salinas-Ruiz, J., Agbona, A., Ortiz Reyes, A. E., … Crossa, J. (2024). Exploring Data Augmentation Algorithm to Improve Genomic Prediction of Top-Ranking Cultivars. Algorithms, 17(6). https://doi.org/10.3390/a17060260
