A Study on Online Source Extraction in the Presence of Changing Speaker Positions

Jens Heitkaemper; Thomas Fehér; Michael Freitag; Reinhold Haeb-Umbach

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019) 11816 LNAI 198-209

DOI: 10.1007/978-3-030-31372-2_17

Abstract

Multi-talker speech and moving speakers still pose a significant challenge to automatic speech recognition systems. Assuming an enrollment utterance of the target speaker is available, the so-called SpeakerBeam concept has recently been proposed to extract the target speaker from a speech mixture. If multi-channel input is available, spatial properties of the speaker can be exploited to support the source extraction. In this contribution we investigate different approaches to exploit such spatial information. In particular, we are interested in the question of how useful this information is if the target speaker changes his/her position. To this end, we present a SpeakerBeam-based source extraction network that is adapted to work on moving speakers by recursively updating the beamformer coefficients. Experimental results are presented on two data sets, one with artificially created room impulse responses, and one with real room impulse responses and noise recorded in a conference room. Interestingly, spatial features turn out to be advantageous even if the speaker position changes.

Citation (APA)

Heitkaemper, J., Fehér, T., Freitag, M., & Haeb-Umbach, R. (2019). A Study on Online Source Extraction in the Presence of Changing Speaker Positions. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11816 LNAI, pp. 198–209). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-31372-2_17
