Speaker Identification (SI) has numerous real-world applications. Earlier SI systems used traditional classifiers such as Gaussian Mixture Models (GMM), Support Vector Machines (SVM), and Hidden Markov Models (HMM), which require features such as Mel Frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC) to be generated first. These approaches do not perform well when the audio data is captured through multiple devices and recorded in different environments, i.e., under mismatched conditions. Machine Learning (ML) algorithms usually provide better accuracy and have therefore become more popular. Restricted Boltzmann Machines (RBM), Long Short-Term Memory (LSTM) networks, and Convolutional Neural Networks (CNN) are some of the ML approaches applied to SI. In this paper, a CNN is used for automatic feature extraction and speaker classification on the IITG-MV noisy dataset. The CNN performs better than the GMM, especially in the device-mismatch case.
CITATION STYLE
Chakraborty, T., Barai, B., Chatterjee, B., Das, N., Basu, S., & Nasipuri, M. (2020). Closed-Set Device-Independent Speaker Identification Using CNN. In Advances in Intelligent Systems and Computing (Vol. 1034, pp. 291–299). Springer. https://doi.org/10.1007/978-981-15-1084-7_28