Speech recognition is an important research field with a wide range of applications. However, its performance is affected by the speaker's dialect, so dialect recognition is often used to supply an additional feature to speech recognition systems. Recognizing dialects is not easy. Machine learning is now widely applied to the task, but machine learning-based dialect recognition faces the challenge of class imbalance and class overlap, which affect a wide variety of classification techniques. This study applies Random Forest combined with an oversampling technique to dialect recognition. The hyperparameters of the Random Forest algorithm are optimized with the Grid Search method. Experiments on the Speech Accent Archive data using MFCC (Mel-frequency cepstral coefficient) features yielded an accuracy of 0.91 and an AUC of 0.95.
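To make the class-imbalance step concrete, here is a minimal sketch of random oversampling in plain Python. The abstract does not say which oversampling variant the authors used, so random duplication of minority-class samples is an assumption chosen for simplicity. The function name `random_oversample`, the toy feature vectors, and the dialect labels are all hypothetical stand-ins for per-utterance MFCC features.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class.

    Assumption: plain random oversampling; the paper's exact variant
    is not stated in the abstract.
    """
    rng = random.Random(seed)

    # Group feature vectors by their class label.
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    out_x, out_y = [], []
    for y, rows in by_class.items():
        # Draw the missing samples with replacement from the class itself.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for x in rows + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data standing in for MFCC vectors and dialect labels.
X = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8],
     [0.8, 0.9], [0.7, 0.7], [0.5, 0.1]]
y = ["english", "english", "english", "english",
     "arabic", "arabic", "mandarin"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

After the call, each of the three classes holds four samples. In a real experiment the oversampling would normally be applied to the training split only, so that duplicated samples do not leak into the evaluation data; the balanced set would then be passed to the classifier and its hyperparameter search.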
CITATION STYLE
Azhar, M., & Pardede, H. F. (2021). Klasifikasi Dialek Pengujar Bahasa Inggris Menggunakan Random Forest. JURNAL MEDIA INFORMATIKA BUDIDARMA, 5(2), 439. https://doi.org/10.30865/mib.v5i2.2754