High level speaker specific features modeling in automatic speaker recognition system

8Citations
Citations of this article
10Readers
Mendeley users who have this article in their library.

Abstract

Spoken words convey several levels of information. At the primary level, the speech conveys words or spoken messages, but at the secondary level, the speech also reveals information about the speakers. This work is based on the high-level speaker-specific features on statistical speaker modeling techniques that express the characteristic sound of the human voice. Using Hidden Markov model (HMM), Gaussian mixture model (GMM), and Linear Discriminant Analysis (LDA) models build Automatic Speaker Recognition (ASR) system that are computational inexpensive can recognize speakers regardless of what is said. The performance of the ASR system is evaluated for clear speech to a wide range of speech quality using a standard TIMIT speech corpus. The ASR efficiency of HMM, GMM, and LDA based modeling technique are 98.8%, 99.1%, and 98.6% and Equal Error Rate (EER) is 4.5%, 4.4% and 4.55% respectively. The EER improvement of GMM modeling technique based ASR systemcompared with HMM and LDA is 4.25% and 8.51% respectively.

Cite

CITATION STYLE

APA

Singh, S., & Singh, P. (2020). High level speaker specific features modeling in automatic speaker recognition system. International Journal of Electrical and Computer Engineering, 10(2), 1859–1867. https://doi.org/10.11591/ijece.v10i2.pp1859-1867

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free