The proper allocation of data between training and testing is a critical factor influencing the performance of deep learning models, especially those built upon pre-trained architectures. Choosing a suitable training set size is important for a classification model's generalization performance. The main goal of this study is to find the appropriate training set size for three pre-trained networks using different custom datasets. To this end, the study presented in this paper explores the effect of varying the train / test split ratio on the performance of three popular pre-trained models, namely MobileNetV2, ResNet50v2 and VGG19, with a focus on the image classification task. In this work, three balanced datasets never seen by the models have been used, each containing 1000 images divided into two classes. The train / test split ratios used for this study are: 60–40, 70–30, 80–20 and 90–10. The evaluation focuses on the critical metrics of sensitivity, specificity and overall accuracy to assess the performance of the classifiers under the different ratios. Experimental results show that the performance of the classifiers is affected by varying the training / testing split ratio for the three custom datasets. Moreover, with the three pre-trained models, using more than 70% of the dataset images for training gives better performance.
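The evaluation protocol described above can be sketched in a few lines. The helper names below (`split_dataset`, `binary_metrics`) are illustrative assumptions, not the authors' code; sensitivity, specificity, and accuracy are computed from the standard confusion-matrix counts for a binary task, and the split fractions mirror the four ratios studied in the paper.

```python
import random

def split_dataset(items, train_fraction, seed=0):
    """Shuffle and split a dataset into train/test at the given fraction."""
    rng = random.Random(seed)
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return sensitivity, specificity, accuracy

# 1000 balanced two-class samples, as in the paper's custom datasets
data = [(i, i % 2) for i in range(1000)]
for frac in (0.60, 0.70, 0.80, 0.90):
    train, test = split_dataset(data, frac)
    print(f"{int(frac * 100)}-{100 - int(frac * 100)} split: "
          f"{len(train)} train / {len(test)} test")
```

In the actual experiments, `train` would feed a fine-tuning loop over one of the pre-trained backbones and `binary_metrics` would be applied to the model's predictions on `test` for each ratio.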
Citation:
Bichri, H., Chergui, A., & Hain, M. (2024). Investigating the Impact of Train / Test Split Ratio on the Performance of Pre-Trained Models with Custom Datasets. International Journal of Advanced Computer Science and Applications, 15(2), 331–339. https://doi.org/10.14569/IJACSA.2024.0150235