Successful speech enhancement by convolutive blind source separation (BSS) techniques requires careful design of all aspects of the chosen separation method. The conventional strategy for system initialization in both time- and frequency-domain BSS involves a diagonal center-spike FIR filter matrix and no data preprocessing; however, this strategy may not be the best choice for a given separation algorithm. In this paper, we experimentally evaluate two approaches for potentially improving the performance of time-domain and frequency-domain natural gradient speech separation algorithms: prewhitening of the signal mixtures, and delay-and-sum beamforming initialization of the separation system. The goal is to determine which of the two classes of algorithms benefits most from each approach. Our results indicate that frequency-domain natural gradient BSS methods generally need geometric information about the system to obtain any reasonable separation quality. For time-domain natural gradient separation algorithms, either beamforming initialization or prewhitening improves separation performance, particularly for larger-scale problems involving three or more sources and sensors. © Springer-Verlag Berlin Heidelberg 2007.
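As a rough illustration of two ideas named in the abstract, the following sketch builds a diagonal center-spike FIR filter matrix (the conventional initialization) and applies a simple prewhitening step to a set of mixtures. It is only a sketch under stated assumptions: the function names `center_spike_init` and `prewhiten`, the array layouts, and the synthetic data are hypothetical, and the whitening shown is purely spatial (instantaneous), which may differ from the prewhitening procedure evaluated in the paper.

```python
import numpy as np


def center_spike_init(n_channels, filter_length):
    """Diagonal center-spike FIR filter matrix (hypothetical helper).

    Returns an array of shape (n_channels, n_channels, filter_length) in
    which each diagonal filter is a unit impulse at the center tap and
    every off-diagonal filter is zero.
    """
    w = np.zeros((n_channels, n_channels, filter_length))
    idx = np.arange(n_channels)
    w[idx, idx, filter_length // 2] = 1.0
    return w


def prewhiten(x):
    """Spatially whiten mixtures x of shape (n_sensors, n_samples).

    Assumption: instantaneous (spatial) whitening only, using the
    symmetric inverse square root of the sample covariance matrix.
    """
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitener = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return whitener @ x, whitener


# Synthetic stand-in data (not speech): three Laplacian sources, random mixing.
rng = np.random.default_rng(0)
sources = rng.laplace(size=(3, 20000))
mixing = rng.normal(size=(3, 3))
mixtures = mixing @ sources

whitened, whitener = prewhiten(mixtures)
cov_whitened = whitened @ whitened.T / whitened.shape[1]

w0 = center_spike_init(n_channels=3, filter_length=65)
```

After this step the sample covariance of `whitened` is approximately the identity matrix, and `w0` passes each whitened channel through unchanged apart from a delay of `filter_length // 2` samples.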
CITATION STYLE
Gupta, M., & Douglas, S. C. (2007). Beamforming initialization and data prewhitening in natural gradient convolutive blind source separation of speech mixtures. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4666 LNCS, pp. 462–470). Springer Verlag. https://doi.org/10.1007/978-3-540-74494-8_58