Speaker identification based on log area ratio and Gaussian mixture models in narrow-band speech: Speech understanding / interaction

Abstract

Deriving log area ratio (LAR) coefficients from linear prediction coefficients (LPC) is a well-known feature extraction technique in speech applications. This paper presents a novel way to use the LAR feature in a speaker identification system: instead of mel-frequency cepstral coefficients (MFCC), the LAR feature is used in a Gaussian mixture model (GMM) based speaker identification system. An F-ratio feature analysis conducted on both the LAR and MFCC feature vectors showed that the lower-order LAR coefficients are superior to their MFCC counterparts. The text-independent, closed-set speaker identification rate, as tested on a down-sampled version of the TIMIT database, improved from 96.73% using the MFCC feature to 98.81% using the LAR feature. © Springer-Verlag Berlin Heidelberg 2004.
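For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that reflection coefficients are obtained with a Levinson-Durbin recursion on the frame autocorrelation, that LAR is defined as log((1 + k) / (1 - k)) (sign conventions differ between texts), and that the F-ratio is the variance of the per-speaker means divided by the mean within-speaker variance. The function names `reflection_coefficients`, `log_area_ratios`, and `f_ratio` are hypothetical, chosen only for illustration.

```python
import numpy as np


def reflection_coefficients(frame, order):
    """Levinson-Durbin recursion on the frame's autocorrelation.

    Returns reflection coefficients k_1..k_order for the predictor
    polynomial A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (assumed convention).
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Biased autocorrelation for lags 0..order.
    r = np.array([np.dot(frame[: n - lag], frame[lag:]) for lag in range(order + 1)])
    if r[0] <= 0.0:
        raise ValueError("frame has zero energy")
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    k = np.zeros(order)
    for i in range(1, order + 1):
        # Correlation of the current prediction error with lag i.
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k_i = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k_i * a_prev[i - j]
        a[i] = k_i
        err *= 1.0 - k_i ** 2
        k[i - 1] = k_i
    return k


def log_area_ratios(k):
    """LAR under the assumed convention log((1 + k) / (1 - k))."""
    k = np.asarray(k, dtype=float)
    return np.log((1.0 + k) / (1.0 - k))


def f_ratio(features, labels):
    """Per-dimension F-ratio: variance of the speaker means divided by
    the average within-speaker variance (one common definition)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    speakers = np.unique(labels)
    means = np.array([features[labels == s].mean(axis=0) for s in speakers])
    within = np.array([features[labels == s].var(axis=0) for s in speakers])
    return means.var(axis=0) / within.mean(axis=0)
```

Under this definition, a larger F-ratio for a coefficient means the speakers are better separated along that dimension relative to their own spread, which is the sense in which the abstract compares lower-order LAR coefficients against MFCC.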

Cite

Chow, D., & Abdulla, W. H. (2004). Speaker identification based on log area ratio and Gaussian mixture models in narrow-band speech: Speech understanding / interaction. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3157, pp. 901–908). Springer Verlag. https://doi.org/10.1007/978-3-540-28633-2_95
