The proper allocation of data between training and testing is a critical factor influencing the performance of deep learning models, especially those built upon pre-trained architectures. Choosing a suitable training set size is important for a classification model's generalization performance. The main goal of this study is to find the appropriate training set size for three pre-trained networks using different custom datasets. To this end, the study presented in this paper explores the effect of varying the train / test split ratio on the performance of three popular pre-trained models, namely MobileNetV2, ResNet50v2 and VGG19, with a focus on the image classification task. In this work, three balanced datasets never seen by the models have been used, each containing 1000 images divided into two classes. The train / test split ratios used for this study are: 60–40, 70–30, 80–20 and 90–10. The evaluation focuses on the critical metrics of sensitivity, specificity and overall accuracy to assess the performance of the classifiers under the different ratios. Experimental results show that the performance of the classifiers is affected by varying the training / testing split ratio for the three custom datasets. Moreover, with the three pre-trained models, using more than 70% of the dataset images for training gives better performance.
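The evaluation protocol described above can be sketched in a few lines. The helper names below (`split_dataset`, `binary_metrics`) are illustrative assumptions, not the authors' code; sensitivity, specificity, and accuracy are computed from the standard confusion-matrix counts for a binary task, and the split fractions mirror the four ratios studied in the paper.

```python
import random

def split_dataset(items, train_fraction, seed=0):
    """Shuffle and split a dataset into train/test at the given fraction."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return sensitivity, specificity, accuracy

# 1000 balanced two-class samples, as in the paper's custom datasets
data = [(i, i % 2) for i in range(1000)]
for frac in (0.60, 0.70, 0.80, 0.90):
    train, test = split_dataset(data, frac)
    print(f"{int(frac * 100)}-{100 - int(frac * 100)} split: "
          f"{len(train)} train / {len(test)} test")
```

In the actual experiments, `train` would feed a fine-tuning loop over one of the pre-trained backbones and `binary_metrics` would be applied to the model's predictions on `test` for each ratio.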
Citation:
Bichri, H., Chergui, A., & Hain, M. (2024). Investigating the Impact of Train / Test Split Ratio on the Performance of Pre-Trained Models with Custom Datasets. International Journal of Advanced Computer Science and Applications, 15(2), 331–339. https://doi.org/10.14569/IJACSA.2024.0150235