Speech segregation based on sound localization

N. Roman; D. L. Wang; G. J. Brown

Conference Proceedings

Proceedings of the International Joint Conference on Neural Networks (2001) 4 2861-2866

DOI: 10.1121/1.1610463

Abstract

We study the cocktail-party effect, which refers to the ability of a listener to attend to a single talker in the presence of adverse acoustical conditions. It has been observed that this ability improves in the presence of binaural cues. In this paper, we explore a technique for speech segregation based on sound localization cues. The auditory masking phenomenon motivates an "ideal" binary mask in which time-frequency regions that correspond to the weak signal are canceled. In our model we estimate this binary mask by observing that systematic changes of the interaural time differences and intensity differences occur as the energy ratio of the original signals is modified. The performance of our model is comparable with results obtained using the ideal binary mask and it shows a large improvement over existing pitch-based algorithms.
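The two ideas in the abstract, an "ideal" binary mask over time-frequency regions and the interaural time and intensity differences used to estimate it, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ideal_binary_mask` and `interaural_cues`, the 0 dB threshold, the cross-correlation estimate of the time difference, and the left-over-right sign convention for the intensity difference are all assumptions made for illustration. The sketch also assumes that per-unit energies of the target and the interference are available, which is only true when the premixed signals are known.

```python
import numpy as np


def ideal_binary_mask(target_energy, interference_energy, threshold_db=0.0):
    """Keep (1) time-frequency units where the target is stronger, cancel (0) the rest.

    Both inputs are arrays of per-unit energies with the same shape,
    e.g. (frequency channels, time frames). The 0 dB default is an assumption.
    """
    target_energy = np.asarray(target_energy, dtype=float)
    interference_energy = np.asarray(interference_energy, dtype=float)
    eps = np.finfo(float).tiny  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10(
        (target_energy + eps) / (interference_energy + eps)
    )
    return (local_snr_db > threshold_db).astype(int)


def interaural_cues(left, right, sample_rate, max_lag):
    """Estimate the interaural time difference (seconds) and intensity
    difference (dB) for one frame of one frequency channel.

    Assumes equal-length frames and max_lag smaller than the frame length.
    A positive time difference means the right-ear signal lags the left.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n = len(left)
    lags = np.arange(-max_lag, max_lag + 1)
    # Cross-correlation c(k) = sum over n of left[n] * right[n + k]
    xcorr = np.array([
        np.dot(left[max(0, -k): n - max(0, k)],
               right[max(0, k): n - max(0, -k)])
        for k in lags
    ])
    itd_seconds = lags[np.argmax(xcorr)] / sample_rate
    eps = np.finfo(float).tiny
    # Intensity difference as the left-to-right energy ratio in dB (assumed convention)
    iid_db = 10.0 * np.log10(
        (np.sum(left ** 2) + eps) / (np.sum(right ** 2) + eps)
    )
    return itd_seconds, iid_db
```

In a localization-based system of the kind the abstract describes, the mask would be estimated from how these two cues deviate in each time-frequency unit as the energy ratio of the sources changes, rather than computed from the premixed energies as `ideal_binary_mask` does here.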

Cite

Roman, N., Wang, D. L., & Brown, G. J. (2001). Speech segregation based on sound localization. In Proceedings of the International Joint Conference on Neural Networks (Vol. 4, pp. 2861–2866). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1121/1.1610463
