Feature selection for improving Indian spoken language identification in utterance duration mismatch condition

Aarti Bakshi; Sunil Kumar Kopparapu

Journal Article (Open Access)

Bulletin of Electrical Engineering and Informatics (2021) 10(5) 2578-2587

DOI: 10.11591/eei.v10i5.3173

Abstract

In spoken language identification (SLID) systems, the test data may be of significantly shorter duration than the training data, a situation known as the duration mismatch condition. This work uses duration-normalized features to identify the spoken language among nine Indian languages under duration mismatch conditions. Random forest-based importance vectors of 1582 OpenSMILE features are calculated for each utterance in datasets of different durations. The feature importance vectors are normalized within each dataset and then across the different duration datasets. The number of duration-normalized features is chosen to maximize SLID system accuracy. Three classifiers, an artificial neural network (ANN), a support vector machine (SVM), and a random forest (RF), are used along with their fusion, whose weights are optimized using logistic regression. The speech material comprised utterances of 30 s each, extracted from the All India Radio dataset covering nine Indian languages. Seven new datasets of shorter utterance durations were generated by carefully splitting each utterance. Experimental results showed that the 150 most important duration-normalized features were optimal, giving a relative accuracy increase of 18-80% in mismatch conditions. Accuracy decreased as the duration mismatch increased.
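
The two-stage normalization and top-k selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact normalization formula, so the choices here are assumptions: each dataset's importance vector is scaled to sum to 1, the scaled vectors are averaged across duration datasets, and the k highest-scoring features are kept. The function names and the toy importance values are hypothetical; in practice the per-dataset vectors would come from a trained random forest over the 1582 OpenSMILE features.

```python
from typing import Dict, List


def normalize(vector: List[float]) -> List[float]:
    """Scale a non-negative importance vector so its entries sum to 1."""
    total = sum(vector)
    if total == 0:
        return [0.0] * len(vector)
    return [v / total for v in vector]


def duration_normalized_importance(per_dataset: Dict[str, List[float]]) -> List[float]:
    """Normalize within each duration dataset, then average across datasets.

    Assumption: averaging is used as the cross-dataset normalization step;
    the source abstract does not specify the exact formula.
    """
    if not per_dataset:
        raise ValueError("at least one duration dataset is required")
    normalized = [normalize(v) for v in per_dataset.values()]
    n_features = len(normalized[0])
    if any(len(v) != n_features for v in normalized):
        raise ValueError("all importance vectors must have the same length")
    n_datasets = len(normalized)
    return [sum(v[i] for v in normalized) / n_datasets for i in range(n_features)]


def top_k_features(importance: List[float], k: int) -> List[int]:
    """Return indices of the k most important features, highest first."""
    ranked = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    return ranked[:k]


# Toy importances for 4 features in three hypothetical duration datasets.
toy_importances = {
    "30s": [4.0, 1.0, 3.0, 2.0],
    "10s": [2.0, 1.0, 5.0, 2.0],
    "3s": [1.0, 1.0, 6.0, 2.0],
}
combined = duration_normalized_importance(toy_importances)
selected = top_k_features(combined, k=2)
```

With real data, `k` would be swept over a range of values and the one giving the best validation accuracy retained; the abstract reports 150 as the optimal count for its setup.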

APA

Bakshi, A., & Kopparapu, S. K. (2021). Feature selection for improving Indian spoken language identification in utterance duration mismatch condition. Bulletin of Electrical Engineering and Informatics, 10(5), 2578–2587. https://doi.org/10.11591/eei.v10i5.3173
