Benchmarking deep learning models for surface defect detection: a reproducible and statistically-rigorous approach

Darío G. Lema; Lidia Sánchez-González; Rubén Usamentiaga; F. J. delaCalle

Journal Article (Open Access)

Journal of Intelligent Manufacturing (2025)

DOI: 10.1007/s10845-025-02672-8

Abstract

Automated surface defect detection has been a key research topic for many years, with deep learning-based object detection being one of the most widely used approaches. However, comparing the results of different models remains a challenge due to the use of varying dataset partitions and the stochastic nature of training, which can introduce variability in outcomes. This study highlights that improvements in performance metrics, such as average precision (AP50), do not always reflect a model’s true effectiveness, as other factors may influence these results. To address this challenge, a robust methodology is proposed, specifically designed for small datasets, which utilizes analysis of variance and Tukey’s test to ensure statistical significance. This methodology provides a reliable and reproducible framework for comparing results across models. The proposed methodology is demonstrated using the latest object detection models and the Northeastern University surface defect dataset, revealing that recent advancements do not always lead to statistically significant improvements. The source code has been made publicly available to promote reproducibility.
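
The statistical comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' published code: it only shows the first step, a one-way analysis of variance over repeated training runs, under stated assumptions. The function name `one_way_anova_f`, the model labels, and the AP50 values are all hypothetical placeholders invented for illustration; they are not results from the paper.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists; each inner list holds the metric
    (e.g. AP50) from repeated training runs of one model.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual runs around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# HYPOTHETICAL AP50 scores from five runs per model (made-up numbers,
# NOT taken from the paper), e.g. different seeds or dataset partitions.
ap50_runs = {
    "model_a": [0.74, 0.76, 0.75, 0.77, 0.73],
    "model_b": [0.75, 0.77, 0.74, 0.78, 0.76],
    "model_c": [0.80, 0.82, 0.81, 0.79, 0.83],
}

f_stat, df_b, df_w = one_way_anova_f(list(ap50_runs.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F statistic suggests that at least one model's mean differs from the others, but it does not say which. Turning F into a p-value requires the F distribution, and the pairwise follow-up requires the studentized range distribution used by Tukey's test; in practice both are usually taken from a statistics library (for example `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`) rather than implemented by hand.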

Citation (APA)

Lema, D. G., Sánchez-González, L., Usamentiaga, R., & delaCalle, F. J. (2025). Benchmarking deep learning models for surface defect detection: a reproducible and statistically-rigorous approach. Journal of Intelligent Manufacturing. https://doi.org/10.1007/s10845-025-02672-8
