Nonlinear postprocessing for blind speech separation

Dorothea Kolossa; Reinhold Orglmeister

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 3195, pp. 832-839 (2004)

DOI: 10.1007/978-3-540-30110-3_105

37 Citations

8 Readers

Abstract

Frequency-domain ICA has been used successfully to separate the utterances of interfering speakers in convolutive environments; see e.g. [6], [7]. Improved separation results can be obtained by applying a time-frequency mask to the ICA outputs. Once the direction-of-arrival information has been used for permutation correction, the time-frequency mask is obtained with little computational effort. The proposed postprocessing is applied in conjunction with two frequency-domain ICA methods and a beamforming algorithm, and it increases separation performance for reverberant as well as in-car speech recordings by an average of 3.8 dB. By combining ICA and time-frequency masking, SNR improvements of up to 15 dB are obtained in the car environment. Because it is robust both to the environment and to the choice of ICA algorithm, time-frequency masking appears to be a good choice for enhancing the output of convolutive ICA algorithms at marginal computational cost. © Springer-Verlag 2004.
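The abstract does not give the mask's exact form, so the following is only a minimal sketch of one common kind of time-frequency mask, not the authors' method. It assumes that the separated outputs are already available as complex spectrograms with permutations corrected, and it keeps a time-frequency bin in an output only where that output exceeds every other output by a chosen level. The function names, the array layout `(n_sources, n_freq, n_frames)`, and the parameter `threshold_db` are all hypothetical choices made for illustration.

```python
import numpy as np


def binary_tf_masks(Y, threshold_db=0.0):
    """Compute one binary time-frequency mask per separated output.

    Y is a complex array of shape (n_sources, n_freq, n_frames) holding
    the separated spectrograms (assumed layout, at least two sources).
    A bin is kept for source i only if its level exceeds the strongest
    competing output by more than threshold_db (an assumed rule).
    """
    eps = 1e-12  # avoids log10(0) in silent bins
    level_db = 20.0 * np.log10(np.abs(Y) + eps)
    masks = np.zeros(Y.shape, dtype=float)
    for i in range(Y.shape[0]):
        # Strongest competing output in every time-frequency bin.
        strongest_other = np.delete(level_db, i, axis=0).max(axis=0)
        masks[i] = (level_db[i] - strongest_other > threshold_db).astype(float)
    return masks


def apply_masks(Y, masks):
    """Elementwise masking of the separated spectrograms."""
    return Y * masks


# Toy example: two outputs, one frequency bin, two frames.
Y_demo = np.array([[[1.0, 0.1]], [[0.2, 2.0]]], dtype=complex)
masks_demo = binary_tf_masks(Y_demo)
Y_masked_demo = apply_masks(Y_demo, masks_demo)
```

In this sketch the mask costs one magnitude comparison per bin, which is in line with the abstract's remark that the postprocessing adds little computational effort. A larger `threshold_db` suppresses more interference but also discards more bins of the target signal.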

Cite

Kolossa, D., & Orglmeister, R. (2004). Nonlinear postprocessing for blind speech separation. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 3195, 832–839. https://doi.org/10.1007/978-3-540-30110-3_105