Deep-learning (DL) performs well in computer-vision and medical-imaging automated decision-making applications. A bottleneck of DL stems from the large amount of labelled data required to train accurate models that generalise well. Data scarcity and imbalance are common problems in imaging applications that can lead DL models towards biased decision making. A solution to this problem is synthetic data. Synthetic data is an inexpensive substitute to real data for improved accuracy and generalisability of DL models. This survey reviews the recent methods published in relation to the creation and use of synthetic data for computer-vision and medical-imaging DL applications. The focus will be on applications that utilised synthetic data to improve DL models by either incorporating an increased diversity of data that is difficult to obtain in real life, or by reducing a bias caused by class imbalance. Computer-graphics software and generative networks are the most popular data generation techniques encountered in the literature. We highlight their suitability for typical computer-vision and medical-imaging applications, and present promising avenues for research to overcome their computational and theoretical limitations.
CITATION STYLE
Paproki, A., Salvado, O., & Fookes, C. (2024). Synthetic Data for Deep Learning in Computer Vision & Medical Imaging: A Means to Reduce Data Bias. ACM Computing Surveys, 56(11). https://doi.org/10.1145/3663759
Mendeley helps you to discover research relevant for your work.