Real-time sound source localization and separation based on active audio-visual integration

Abstract

Robot audition in the real world must cope with environmental noise, reverberation, and the motor noise caused by the robot's own movements. This paper presents the active direction-pass filter (ADPF), which separates sounds originating from a specified direction using a pair of microphones. The ADPF is implemented by hierarchical integration of visual and auditory processing, with hypothetical reasoning over the interaural phase difference (IPD) and interaural intensity difference (IID) of each subband. In creating hypotheses, the reference IPD and IID values are computed on demand by auditory epipolar geometry. Since the separation performance of the ADPF depends on the source direction, the robot controls its facing direction by motor movement. Human tracking and sound source separation based on the ADPF are implemented on an upper-torso humanoid and run in real time on four PCs connected via Gigabit Ethernet. The signal-to-noise ratio (SNR) of each sound separated by the ADPF from a mixture of two speech signals of equal loudness improves from 0 dB to about 10 dB. © Springer-Verlag Berlin Heidelberg 2003.
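To make the subband-selection idea concrete, below is a minimal sketch of a direction-pass filter in Python. It is not the authors' implementation: it substitutes a simple free-field, plane-wave model for the paper's auditory epipolar geometry, uses IPD only (the paper additionally uses IID in high-frequency subbands, where IPD becomes ambiguous), and all names and parameter values (mic_dist, tol, nperseg) are illustrative assumptions.

```python
import numpy as np
from scipy.signal import stft, istft

def direction_pass_filter(left, right, fs, theta_deg, mic_dist=0.2,
                          c=343.0, tol=0.4, nperseg=512):
    """Pass subbands whose observed IPD matches the IPD expected for a
    source at azimuth theta_deg (0 = straight ahead).

    Free-field plane-wave model used in place of the paper's auditory
    epipolar geometry; mic_dist, tol, and nperseg are illustrative.
    """
    f, _, L = stft(left, fs, nperseg=nperseg)
    _, _, R = stft(right, fs, nperseg=nperseg)

    # Observed IPD per subband and frame.
    ipd = np.angle(L * np.conj(R))

    # Expected IPD for the target direction: interaural delay tau
    # under a plane-wave assumption, converted to phase per subband.
    tau = mic_dist * np.sin(np.radians(theta_deg)) / c
    expected = 2 * np.pi * f * tau

    # Wrap the difference to (-pi, pi] and keep matching subbands.
    diff = np.angle(np.exp(1j * (ipd - expected[:, None])))
    mask = np.abs(diff) < tol

    # Resynthesize using only the subbands that passed the test.
    _, out = istft(L * mask, fs, nperseg=nperseg)
    return out
```

Steering such a filter with the humanoid's motors, so that the visually tracked speaker stays in the frontal region where the pass band is narrowest, is what makes the filter "active" in the paper's sense.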

Citation (APA)

Okuno, H. G., & Nakadai, K. (2003). Real-time sound source localization and separation based on active audio-visual integration. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2686, 118–125. https://doi.org/10.1007/3-540-44868-3_16
