Pitch correlogram clustering for fast speaker identification

Abstract

Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational cost at run time limits their use in online or real-time speaker identification. Two-stage identification systems have been suggested in the literature to speed up the identification process: the database is partitioned into clusters based on some proximity criterion, and only the GMMs of a single cluster are evaluated in each test. However, most of the clustering algorithms used have shown limited success, apparently because the feature spaces used for clustering and for the GMMs are derived from similar speech characteristics. This paper presents a new clustering approach based on the concept of a pitch correlogram, which captures the frame-to-frame pitch variations of a speaker rather than short-time spectral characteristics such as cepstral coefficients and spectral slopes. The effectiveness of this two-stage identification process is demonstrated on the IVIE corpus of 110 speakers. The overall system achieves a run-time advantage of 500% as well as a 10% reduction of error in overall speaker identification. © 2004 Hindawi Publishing Corporation.
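The abstract does not spell out how the pitch correlogram is computed or how clusters are formed, so the following is only a minimal sketch of the general two-stage idea under stated assumptions, not the authors' method. It assumes the correlogram can be approximated by a normalized 2-D histogram of consecutive voiced pitch values, that each cluster is summarized by a centroid correlogram, and that each speaker has a diagonal-covariance GMM given as `(weights, means, variances)`. The names `pitch_correlogram`, `gmm_log_likelihood`, and `identify`, the bin settings, and the convention that a pitch of 0 marks an unvoiced frame are all hypothetical.

```python
import numpy as np

def pitch_correlogram(pitch_hz, n_bins=16, f_min=60.0, f_max=400.0):
    """Normalized 2-D histogram of (pitch[t], pitch[t+1]) pairs.

    Assumed stand-in for the paper's pitch correlogram; the abstract
    gives no formula. A pitch of 0 is taken to mean "unvoiced".
    """
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    prev, nxt = pitch_hz[:-1], pitch_hz[1:]
    voiced = (prev > 0) & (nxt > 0)  # keep only voiced-to-voiced transitions
    edges = np.linspace(f_min, f_max, n_bins + 1)
    hist, _, _ = np.histogram2d(prev[voiced], nxt[voiced], bins=[edges, edges])
    total = hist.sum()
    return hist / total if total > 0 else hist

def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM.

    frames: (T, D); weights: (K,); means, variances: (K, D).
    """
    frames = np.asarray(frames, dtype=float)
    diff = frames[:, None, :] - means[None, :, :]                    # (T, K, D)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)  # (K,)
    log_comp = (np.log(weights) + log_norm
                - 0.5 * np.sum(diff ** 2 / variances, axis=2))       # (T, K)
    # Log-sum-exp over mixture components for numerical stability.
    peak = log_comp.max(axis=1, keepdims=True)
    log_frame = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return float(log_frame.mean())

def identify(pitch_hz, frames, centroids, cluster_members, speaker_gmms):
    """Stage 1: pick the nearest cluster by correlogram distance.
    Stage 2: score only that cluster's speaker GMMs."""
    query = pitch_correlogram(pitch_hz)
    distances = [np.linalg.norm(query - c) for c in centroids]
    cluster = int(np.argmin(distances))
    scores = {
        speaker: gmm_log_likelihood(frames, *speaker_gmms[speaker])
        for speaker in cluster_members[cluster]
    }
    return max(scores, key=scores.get), cluster
```

In a sketch like this, the speed-up comes from stage 2 evaluating only the GMMs in one cluster instead of every speaker in the database. The accuracy then depends on stage 1 choosing the right cluster, which is why the abstract stresses a clustering feature that differs from the spectral features used by the GMMs.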

Jhanwar, N., & Raina, A. K. (2004). Pitch correlogram clustering for fast speaker identification. Eurasip Journal on Applied Signal Processing, 2004(17), 2640–2649. https://doi.org/10.1155/S1110865704408026
