GMM-UBM Based Modeling for Language Identification using New Feature Vectors

  • Nagesh* D
  • et al.
N/ACitations
Citations of this article
1Readers
Mendeley users who have this article in their library.
Get full text

Abstract

The most of the existing LID systems based on the Gaussian Mixture model. The main requirement of the GMM based LID system is it require large amount of speech data to train the GMM model. Most of the Indian languages have the similarity because they are derived from Devanagari. Even though common phonemes exists in phoneme sets across the Indian languages, each language contain its unique phonotactic constraints imposed by the language. Any modeling technique capable of capturing all these slight variations imposed by the language is one of the important language identification cue. To model the GMM based LID system which captures above variations it require large number of mixture components.To model the large number of mixture components using Gaussian Mixture Model (GMM), the technique requires a large number of training data for each language class, which is very difficult to get for Indian languages. The main objective of GMM-UBM based LID system is it require less amount of training data to train(model) the system. In this paper, the importance of GMM-UBM modeling for language identification (LID) task for Indian languages are explored using new set of feature vectors. In GMM-UBM LID system based on the new feature vectors, the phonotactic variations imparted by different Indian languages are modeled using Gaussian Mixture model and Universal Background Model (GMM-UBM) technique. In this type of modeling, some amount of data from each class of language is pooled to create a universal background model. From this UBM model each model class is adapted. In this study, it is found that the performance of new feature vectors GMM-UBM based LID system is superior when compared to conventional new feature vectors based GMM LID system.

Cite

CITATION STYLE

APA

Nagesh*, Dr. A., & Sadanandam, Dr. M. (2020). GMM-UBM Based Modeling for Language Identification using New Feature Vectors. International Journal of Innovative Technology and Exploring Engineering, 4(9), 3034–3039. https://doi.org/10.35940/ijitee.d1919.029420

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free