Abstract
Class imbalance (CI) is a well-known problem in data science. It now affects the data modeling of many real-world processes that are being digitized. The manufacturing industry is highly affected by this problem, especially in fault inspection, prediction, and monitoring, and more generally in any process where production efficiency is high and data samples of anomalous events are rare. In this work, we systematically review the data manipulation, machine learning, and deep learning solutions to the CI problem in the manufacturing domain. We also critically evaluate the metrics that researchers can use to estimate the improvements delivered by their proposed solutions, and we examine the availability of public source code and imbalanced datasets that can be used for benchmarking. Finally, we summarize the most widely applied solutions to the CI problem in manufacturing and discuss future challenges. While providing a reference for best practices at the time of this review, we challenge researchers to standardize the use of data science algorithms for CI in the manufacturing domain.
Cite
de Giorgio, A., Cola, G., & Wang, L. (2023, December 1). Systematic review of class imbalance problems in manufacturing. Journal of Manufacturing Systems. Elsevier B.V. https://doi.org/10.1016/j.jmsy.2023.10.014