Dataset Splitting Techniques Comparison For Face Classification on CCTV Images

Ade Nurhopipah; Uswatun Hasanah

Journal Article, Open Access

IJCCS (Indonesian Journal of Computing and Cybernetics Systems) (2020) 14(4) 341

DOI: 10.22146/ijccs.58092

Abstract

The performance of classification models in machine learning is influenced by many factors, one of which is the dataset splitting method. To avoid overfitting, it is important to apply a suitable dataset splitting strategy. This study compares four dataset splitting techniques: Random Sub-sampling Validation (RSV), k-Fold Cross Validation (k-FCV), Bootstrap Validation (BV), and Moralis Lima Martin Validation (MLMV). The comparison is carried out on face classification of CCTV images using a Convolutional Neural Network (CNN) and a Support Vector Machine (SVM), on two image datasets. The results are evaluated using model accuracy on the training, validation, and test sets, as well as the bias and variance of the model. The experiments show that the k-FCV technique has more stable performance and provides high accuracy on the training set as well as good generalization on the validation and test sets. In contrast, data splitting with the MLMV technique performs worse than the other three techniques, yielding lower accuracy. It also shows higher bias and variance and produces overfitted models, especially on the validation set.
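To make the difference between three of the splitting schemes concrete, the following is a minimal sketch of how RSV, k-FCV, and BV can generate train/validation index sets. It is not the authors' implementation: the function names, parameters, and defaults are illustrative assumptions, and MLMV is left out because the abstract does not describe its procedure.

```python
import random

def random_subsampling(n, val_fraction=0.2, repeats=3, seed=0):
    """RSV sketch: each repeat draws a fresh random train/validation split."""
    rng = random.Random(seed)
    n_val = max(1, int(round(n * val_fraction)))
    splits = []
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        splits.append((sorted(idx[n_val:]), sorted(idx[:n_val])))
    return splits

def k_fold(n, k=5, seed=0):
    """k-FCV sketch: every sample lands in the validation fold exactly once."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    splits = []
    for i in range(k):
        val = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        splits.append((train, val))
    return splits

def bootstrap(n, repeats=3, seed=0):
    """BV sketch: train on n draws with replacement, validate on out-of-bag samples."""
    rng = random.Random(seed)
    splits = []
    for _ in range(repeats):
        train = [rng.randrange(n) for _ in range(n)]
        drawn = set(train)
        val = [i for i in range(n) if i not in drawn]  # never-drawn samples
        splits.append((sorted(train), val))
    return splits
```

Each function returns a list of `(train_indices, validation_indices)` pairs. Under this sketch, only k-FCV guarantees that every sample is validated exactly once; RSV and BV may reuse or skip samples across repeats.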

Citation (APA)

Nurhopipah, A., & Hasanah, U. (2020). Dataset Splitting Techniques Comparison For Face Classification on CCTV Images. IJCCS (Indonesian Journal of Computing and Cybernetics Systems), 14(4), 341. https://doi.org/10.22146/ijccs.58092
