Binaural scene analysis with multidimensional statistical filters

Abstract

The segregation of concurrent speakers and other sound sources is important for improving the performance of audio technology, such as noise reduction and automatic speech recognition (ASR), in difficult acoustic conditions. This technology is relevant for applications like hearing aids, mobile audio devices, robotics, hands-free audio communication and speech-based computer interfaces. Computational auditory scene analysis (CASA) techniques simulate aspects of the processing properties of the human perceptual system, using statistical signal-processing techniques to improve inferences about the causes of the audio input received by the system. This study argues that CASA is a promising approach to source separation and outlines several theoretical arguments to support this hypothesis. With a focus on computational binaural scene analysis, the principles of CASA techniques are reviewed. Furthermore, an experiment explores whether a recent model of binaural interaction can improve ASR performance in multi-speaker conditions with spatially separated, moving speakers. The binaural model provides input to a statistical inference filter that employs a priori information on possible movements of the sources in order to track the positions of the speakers. The tracks are used to adapt a beamformer that selects a specific speaker. The output of the beamformer is subsequently used for an ASR task. Compared to the unprocessed (that is, mixed) data in a two-speaker condition, the word recognition rate obtained with the enhanced signals based on binaural information increased from 30.8% to 88.4%, demonstrating the potential of the proposed CASA-based approach.
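
The tracking step described in the abstract, a statistical filter that combines noisy direction estimates with prior knowledge about how sources move, can be illustrated with a small sketch. The code below is not the chapter's implementation. It assumes a bootstrap particle filter, a single source, a random-walk motion prior, and a Gaussian observation model. The function name `track_azimuth` and the parameters `motion_std` and `obs_std` are hypothetical and chosen only for illustration.

```python
import math
import random


def track_azimuth(observations, n_particles=500, motion_std=2.0,
                  obs_std=10.0, seed=0):
    """Illustrative bootstrap particle filter over one source azimuth (degrees).

    Assumptions (not taken from the chapter): random-walk motion prior,
    Gaussian observation noise, azimuth limited to [-90, 90] degrees.
    """
    rng = random.Random(seed)
    # Start with no knowledge: particles spread uniformly over the frontal plane.
    particles = [rng.uniform(-90.0, 90.0) for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: the motion prior says a source moves only a little per frame.
        particles = [
            min(90.0, max(-90.0, p + rng.gauss(0.0, motion_std)))
            for p in particles
        ]
        # Update: weight each particle by how well it explains the observation.
        weights = [math.exp(-0.5 * ((z - p) / obs_std) ** 2) for p in particles]
        total = sum(weights)
        if total == 0.0:
            # Degenerate case: fall back to uniform weights.
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Estimate: posterior mean azimuth for this frame.
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    # Simulated source moving from -30 to +30 degrees, observed with noise.
    noise = random.Random(1)
    truth = [-30.0 + 0.6 * t for t in range(100)]
    observed = [a + noise.gauss(0.0, 10.0) for a in truth]
    track = track_azimuth(observed)
    print("last true azimuth:", truth[-1], "last estimate:", round(track[-1], 1))
```

In a pipeline like the one the abstract outlines, a track of this kind would steer the beamformer toward the selected speaker. A smaller `motion_std` gives smoother tracks but slower reaction to fast movements.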

Cite

Spille, C., Meyer, B. T., Dietz, M., & Hohmann, V. (2013). Binaural scene analysis with multidimensional statistical filters. In The Technology of Binaural Listening (pp. 145–170). Springer Berlin Heidelberg. https://doi.org/10.1007/978-3-642-37762-4_6
