Estimating Number of Speakers via Density-Based Clustering and Classification Decision

Junjie Yang; Yi Guo; Zuyuan Yang; Liu Yang; Shengli Xie

Journal Article, Open Access

IEEE Access (2019), vol. 7, pp. 176541-176551

DOI: 10.1109/ACCESS.2019.2956772

5 Citations

7 Readers

Abstract

Robustly estimating the number of speakers (NoS) from audio mixtures recorded in a reverberant environment is crucial. Some popular time-frequency (TF) methods approach NoS estimation by assuming that only one speech component is active at each TF slot. However, this assumption is violated in many scenarios where the speech signals are convolved with long room impulse responses, which degrades NoS estimation performance. To tackle this problem, a density-based clustering strategy is proposed that estimates the NoS under a local dominance assumption on the speech signals. The method proceeds in several steps, from clustering to a classification decision on the number of speakers, and is designed with robustness in mind. First, the leading eigenvectors are extracted from the local covariance matrices of the mixture TF components and ranked by a combination of local density and minimum distance to other leading eigenvectors with higher density. Second, a gap-based method is employed to determine the cluster centers from the ranked leading eigenvectors at each frequency bin. Third, a criterion based on the averaged volume of the cluster centers is proposed to select reliable clustering results at certain frequency bins for the classification decision on the NoS. The experimental results demonstrate that the proposed algorithm outperforms existing methods under various reverberation conditions, in both noise-free and noisy cases.
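The ranking and gap steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Generic real-valued points stand in for the leading eigenvectors, local density is taken as a neighbour count within a `cutoff` radius, the ranking score is the product of density and distance to the nearest denser point, and the "gap" is taken as the largest drop between consecutive sorted scores. The function names, the `cutoff` parameter, and these specific choices are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def density_peak_scores(points, cutoff):
    """Score each point by (local density) * (distance to nearest denser point).

    Assumption: density is the number of neighbours closer than `cutoff`.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)          # pairwise Euclidean distances
    rho = (dist < cutoff).sum(axis=1) - 1         # neighbour count, excluding self
    order = np.argsort(-rho, kind="stable")       # densest first, ties by index
    delta = np.empty(len(points))
    for rank, i in enumerate(order):
        if rank == 0:
            # Densest point: use its largest distance to any other point.
            delta[i] = dist[i].max()
        else:
            # Distance to the closest point ranked as denser.
            delta[i] = dist[i, order[:rank]].min()
    return rho * delta


def count_centers_by_gap(scores):
    """Estimate the number of cluster centers from the largest score drop.

    Assumption: the gap is the biggest difference between consecutive
    scores after sorting in descending order.
    """
    s = np.sort(scores)[::-1]
    if len(s) < 2:
        return len(s)
    gaps = s[:-1] - s[1:]
    return int(np.argmax(gaps)) + 1


# Hypothetical data: three tight, well-separated groups of 2-D points.
rng = np.random.default_rng(0)
group_means = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 8.66]])
points = np.vstack([m + 0.2 * rng.standard_normal((30, 2)) for m in group_means])

scores = density_peak_scores(points, cutoff=1.0)
estimated = count_centers_by_gap(scores)
print("estimated number of cluster centers:", estimated)
```

In the setting of the abstract, a count like this would be obtained separately at each frequency bin, and only the bins judged reliable would contribute to the final decision on the NoS. That selection step is not sketched here because the abstract does not define the averaged-volume criterion.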

Cite (APA)

Yang, J., Guo, Y., Yang, Z., Yang, L., & Xie, S. (2019). Estimating Number of Speakers via Density-Based Clustering and Classification Decision. IEEE Access, 7, 176541–176551. https://doi.org/10.1109/ACCESS.2019.2956772
