Text-Independent Speaker Identification Using the Histogram Transform Model

Abstract

In this paper, we propose a novel probabilistic method for text-independent speaker identification (SI). To capture dynamic information during SI, we design super mel-frequency cepstral coefficient (super-MFCC) features by concatenating three neighboring MFCC frames. These super-MFCC vectors are used to train a probabilistic model so that the speaker's characteristics can be sufficiently captured. The probability density function (PDF) of the super-MFCC features is estimated by the recently proposed histogram transform (HT) method. To alleviate the discontinuity problem that commonly occurs when computing multivariate histograms, the HT method generates additional training data. Using these generated data, a smooth PDF of the super-MFCC vectors is obtained. Compared with typical PDF estimation methods, such as the Gaussian mixture model, the HT-based model yields promising improvements in SI.
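
The frame-cascading step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which three frames are joined or how utterance boundaries are handled, so this example assumes each frame is concatenated with its immediate predecessor and successor and that the first and last frames are dropped. The function name `stack_super_frames` and the toy input are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def stack_super_frames(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate each frame with its previous and next neighbor.

    frames: one MFCC vector per time step, all of the same length D.
    Returns len(frames) - 2 vectors of length 3 * D. Boundary frames
    are dropped (an assumption made for this sketch).
    """
    if len(frames) < 3:
        raise ValueError("need at least three frames to build a super-frame")
    return [
        list(frames[t - 1]) + list(frames[t]) + list(frames[t + 1])
        for t in range(1, len(frames) - 1)
    ]


# Toy input: 5 frames of 4 coefficients each (real MFCCs would come from
# a speech front end, which is outside the scope of this sketch).
mfcc = [[float(4 * t + k) for k in range(4)] for t in range(5)]
super_mfcc = stack_super_frames(mfcc)
print(len(super_mfcc), len(super_mfcc[0]))
```

With D coefficients per frame, each stacked vector has 3D dimensions, which is why density estimation on these features becomes harder and a smooth PDF estimate matters.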

Ma, Z., Yu, H., Tan, Z. H., & Guo, J. (2016). Text-Independent Speaker Identification Using the Histogram Transform Model. IEEE Access, 4, 9733–9739. https://doi.org/10.1109/ACCESS.2016.2646458
