Video-guided sound source separation

Junfeng Zhou; Feng Wang; Di Guo; Huaping Liu; Fuchun Sun

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11740 LNAI 415-426

DOI: 10.1007/978-3-030-27526-6_36

Citations: 1

Readers: 3

Abstract

A major aim of sound source separation is to extract the sound of interest from a mixture, such as the sound of objects on the screen. In this paper, we propose a method that incorporates sound-indicated object detection and uses the detection result to separate on-screen sounds from off-screen ones. After training, the object detection network can recognize which object is sounding, much as a human learns which object makes which sound. Then, using the temporal information of sounds in a video segment, we separate out the sound of the object that is not shown in the video. Finally, experiments are carried out on data from AudioSet, and we demonstrate that the method works well in the given scenarios.

Citation (APA)

Zhou, J., Wang, F., Guo, D., Liu, H., & Sun, F. (2019). Video-guided sound source separation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11740 LNAI, pp. 415–426). Springer Verlag. https://doi.org/10.1007/978-3-030-27526-6_36
