The development of music culture has resulted in a problem called optical music recognition (OMR). OMR is a task in computer vision that explores the algorithms and models to recognize musical notation. This study proposed the stacking ensemble learning model to complete the OMR task using the common western musical notation (CWMN) musical notation. The ensemble learning model used four deep convolutional neural networks (DCNNs) models, namely ResNeXt50, Inception-V3, RegNetY-400MF, and EfficientNet-V2-S as the base classifier. This study also analysed the most appropriate technique to be used as the ensemble learning model’s meta-classifier. Therefore, several machine learning techniques are determined to be evaluated, namely support vector machine (SVM), logistic regression (LR), random forest (RF), K-nearest neighbor (KNN), decision tree (DT), and Naïve Bayes (NB). Six publicly available OMR datasets are combined, down sampled, and used to test the proposed model. The dataset consists of the HOMUS_V2, Rebelo1, Rebelo2, Fornes, OpenOMR, and PrintedMusicSymbols datasets. The proposed ensemble learning model managed to outperform the model built in the previous study and succeeded in achieving outstanding accuracy and F1-scores with the best value of 97.51% and 97.52%, respectively; both of which were achieved by the LR meta-classifier.
CITATION STYLE
Ferano, F. C. A., Zahra, A., & Kusuma, G. P. (2023). Stacking ensemble learning for optical music recognition. Bulletin of Electrical Engineering and Informatics, 12(5), 3095–3104. https://doi.org/10.11591/eei.v12i5.5129
Mendeley helps you to discover research relevant for your work.