Abstract
Convolutive and temporally correlated mixtures of speech are tackled with a linear prediction (LP) based temporal pre-whitening stage combined with the natural gradient algorithm (NGA), which performs spatial separation by maximizing entropy at the output of a nonlinear function. In the past, speech sources have been parameterized by the generalized Gaussian density (GGD) model, in which the exponent parameter directly determines the exponent of the corresponding optimal nonlinear function. In this paper, we present an adaptive, source-dependent estimation of this parameter, controlled exclusively by the statistics of the output source estimates. Comparative experimental results illustrate the flexibility of the proposed method, as well as an overall increase in convergence speed and separation performance over existing approaches. © Springer-Verlag 2004.
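As a rough illustration of the idea in the abstract, the sketch below estimates a GGD exponent for each output from its sample statistics and uses that exponent in the nonlinearity of a natural gradient update. It is not the authors' method: the moment-matching estimator, the instantaneous (non-convolutive) mixing model, the omission of the LP pre-whitening stage, and all names (`ggd_ratio`, `estimate_ggd_exponent`, `ggd_score`, `nga_step`, `mu`, `eps`) are assumptions made for this example.

```python
import math
import numpy as np

def ggd_ratio(beta):
    """Theoretical (E|y|)^2 / E[y^2] for a zero-mean GGD with exponent beta."""
    return math.exp(2 * math.lgamma(2 / beta)
                    - math.lgamma(1 / beta)
                    - math.lgamma(3 / beta))

def estimate_ggd_exponent(y, lo=0.2, hi=10.0, iters=60):
    """Moment-matching estimate of the GGD exponent (assumed estimator).

    The ratio above increases monotonically with beta, so bisection on
    [lo, hi] finds the exponent whose ratio matches the sample ratio.
    """
    y = np.asarray(y, dtype=float)
    target = np.mean(np.abs(y)) ** 2 / np.mean(y ** 2)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ggd_score(y, beta, eps=1e-8):
    """Nonlinearity sign(y)|y|^(beta-1); eps avoids division by zero when beta < 1."""
    return np.sign(y) * (np.abs(y) + eps) ** (beta - 1.0)

def nga_step(W, Y, betas, mu=0.01):
    """One natural gradient update, W <- W + mu (I - E[phi(y) y^T]) W.

    Y holds the current output estimates, one source per row. This is the
    instantaneous-mixture form, a simplification of the convolutive case.
    """
    n, T = Y.shape
    Phi = np.vstack([ggd_score(Y[i], betas[i]) for i in range(n)])
    return W + mu * (np.eye(n) - Phi @ Y.T / T) @ W
```

In an adaptive loop under these assumptions, each iteration would compute the outputs `Y = W @ X`, re-estimate one exponent per row with `estimate_ggd_exponent`, and then call `nga_step`, so the nonlinearity follows the statistics of each source estimate rather than a fixed prior.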
Cite
Kokkinakis, K., & Nandi, A. K. (2004). Multichannel speech separation using adaptive parameterization of source PDFs. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3195, 486–493. https://doi.org/10.1007/978-3-540-30110-3_62