On-the-fly image-level oversampling for imbalanced datasets of manufacturing defects

This article is free to access.

Abstract

Visual defect recognition and its manufacturing applications have become an emerging topic in recent AI research. Defect datasets are often severely imbalanced and can be further burdened by the need to separate classes of high visual similarity. Although various data augmentation methods have been proposed to mitigate class imbalance, they often fail to cope with very small minority classes or have fidelity issues with smaller defects, while also requiring significant computational resources to train. In addition, augmentation based on vector-based oversampling struggles to produce high-fidelity inputs and is hard to apply to custom CNN architectures, which often perform better for this type of problem. Our work presents an image-level oversampling method, based on an instance-based image generator, that can be applied to any CNN directly during training without increasing the order of the required training time. It identifies a small number of the most uncertain base samples close to the estimated class boundaries and uses them as seeds for augmentation. The resulting images are of high visual quality and preserve small class differences; they also improve the classifier boundary, leading to higher recall scores than other state-of-the-art approaches.
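The seed-selection idea described in the abstract (pick a few of the most uncertain minority-class samples near the estimated class boundary) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how uncertainty is measured, so predictive entropy of the classifier's softmax output is used here as a stand-in, and the function name `select_uncertain_seeds` and its parameters are hypothetical.

```python
import numpy as np


def select_uncertain_seeds(probs, labels, minority_class, k):
    """Return indices of the k minority-class samples with the highest
    predictive entropy (an assumed proxy for closeness to the boundary)."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Shannon entropy per sample; clipping avoids log(0).
    entropy = -np.sum(probs * np.log(np.clip(probs, 1e-12, 1.0)), axis=1)
    # Restrict the search to samples of the minority class.
    candidates = np.flatnonzero(labels == minority_class)
    # Sort candidates by descending entropy and keep the top k.
    order = candidates[np.argsort(-entropy[candidates], kind="stable")]
    return order[:k]


# Toy softmax outputs for five images and two classes (made-up numbers).
probs = np.array([
    [0.95, 0.05],
    [0.55, 0.45],
    [0.10, 0.90],
    [0.48, 0.52],
    [0.70, 0.30],
])
labels = np.array([0, 1, 1, 1, 0])

seeds = select_uncertain_seeds(probs, labels, minority_class=1, k=2)
print(seeds)
```

In an on-the-fly setting, a selection like this would presumably be re-run periodically during training so that the seeds follow the current decision boundary; the selected images would then be passed to the image generator, which is not sketched here.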

Cite

Theodoropoulos, S., Zajec, P., Rožanec, J. M., Kyriazis, D., & Tsanakas, P. (2024). On-the-fly image-level oversampling for imbalanced datasets of manufacturing defects. Machine Learning, 113(7), 4013–4035. https://doi.org/10.1007/s10994-023-06498-4
