Cross-Lingual Speaker Identification for Indian Languages

Amaan Rizvi; Anupam Jamatia; Dwijen Rudrapal; Kunal Chakma; Björn Gambäck

Conference Proceedings

International Conference Recent Advances in Natural Language Processing, RANLP (2023) 979-987

DOI: 10.26615/978-954-452-092-2_105

Abstract

The paper introduces a cross-lingual speaker identification system for Indian languages based on a Long Short-Term Memory dense neural network (LSTM-DNN). The system was trained on audio recordings in English and evaluated on data from Hindi, Kannada, Malayalam, Tamil, and Telugu, to examine how factors such as phonetic similarity and native accent affect performance. The model takes as input mel-frequency cepstral coefficient (MFCC) features extracted from the audio files. For comparison, the corresponding mel-spectrogram images were also used as input to a ResNet-50 model, while the raw audio was used to train a Siamese network. The LSTM-DNN model outperformed the other two models as well as two more traditional baseline speaker identification models, showing that deep learning models are superior to probabilistic models for capturing low-level speech features and learning speaker characteristics.
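
Both the MFCC features and the mel-spectrograms mentioned in the abstract rest on the mel scale, which spaces frequency bands roughly the way human hearing does. The following is a minimal sketch of that mapping, not the authors' feature pipeline: it uses the common HTK-style formula, and the function names, the 26 bands, and the 8000 Hz upper limit are illustrative assumptions that do not come from the paper.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to mel using the HTK-style formula (an assumed convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands: int, f_min: float, f_max: float) -> List[float]:
    """Return n_bands + 2 edge frequencies in Hz, equally spaced on the mel scale.

    Consecutive triples of edges would define the triangular filters of a
    mel filterbank; the filters themselves are not built here.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + i * step) for i in range(n_bands + 2)]


# Hypothetical settings: 26 bands up to 8000 Hz (not taken from the paper).
edges = mel_band_edges(n_bands=26, f_min=0.0, f_max=8000.0)
widths = [b - a for a, b in zip(edges, edges[1:])]
```

The bands come out narrow at low frequencies and progressively wider at high frequencies, so a filterbank built on these edges resolves the lower part of the spectrum more finely than the upper part.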

Citation (APA)

Rizvi, A., Jamatia, A., Rudrapal, D., Chakma, K., & Gambäck, B. (2023). Cross-Lingual Speaker Identification for Indian Languages. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 979–987). Incoma Ltd. https://doi.org/10.26615/978-954-452-092-2_105
