Acoustic event mixing to multichannel AMI data for distant speech recognition and acoustic event classification benchmarking


Abstract

Currently, the quality of Distant Speech Recognition (DSR) systems cannot match the quality of speech recognition on clean speech acquired by close-talking microphones. The main problems of DSR stem from the far-field nature of the data; one of them is the unpredictable occurrence of acoustic events and scenes, which distort the speech component of the signal. Applying acoustic event detection and classification (AEC) in conjunction with DSR can benefit speech enhancement and improve DSR accuracy. However, no publicly available corpus for conjunctive AEC and DSR currently exists. This paper proposes a procedure for realistically mixing acoustic events and scenes with the far-field multi-channel recordings of the AMI meeting corpus, accounting for spatial reverberation and the distinct placement of sources of different kinds. We evaluate the derived corpus on both the DSR and AEC tasks and present reproducible results that can serve as a baseline for the corpus. The code for the proposed mixing procedure is made available online.
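The abstract does not detail the mixing procedure itself. As a rough illustration of the general idea only (spatializing an event through room impulse responses and adding it to a far-field array recording at a controlled level), the sketch below uses hypothetical inputs `meeting`, `event`, and `rirs`; it is an assumption-based example, not the authors' released code.

```python
import numpy as np
from scipy.signal import fftconvolve

def mix_event_into_meeting(meeting, event, rirs, snr_db, rng=None):
    """Mix a mono acoustic event into a multichannel far-field recording.

    meeting : (n_samples, n_channels) float array recording (e.g. an AMI session)
    event   : (n_event,) mono acoustic event waveform
    rirs    : (n_rir, n_channels) impulse responses from an assumed event
              position to each array microphone (simulated or measured)
    snr_db  : target speech-to-event energy ratio in dB
    """
    if rng is None:
        rng = np.random.default_rng()
    n_samples, n_channels = meeting.shape

    # Spatialize the event by convolving it with each channel's RIR,
    # then truncate to the length of the meeting recording.
    spatial = np.stack(
        [fftconvolve(event, rirs[:, ch]) for ch in range(n_channels)], axis=1
    )[:n_samples]

    # Scale the event so the mixture reaches the requested SNR.
    p_speech = np.mean(meeting ** 2)
    p_event = np.mean(spatial ** 2) + 1e-12
    gain = np.sqrt(p_speech / (p_event * 10.0 ** (snr_db / 10.0)))

    # Place the event at a random offset inside the meeting.
    start = rng.integers(0, n_samples - spatial.shape[0] + 1)
    mixture = meeting.astype(np.float64, copy=True)
    mixture[start:start + spatial.shape[0]] += gain * spatial
    return mixture
```

In practice, per the abstract, the RIRs and mixing levels would be chosen per event class so that different kinds of sources receive distinct, realistic positions relative to the microphone array.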

Citation (APA)

Astapov, S., Svirskiy, G., Lavrentyev, A., Prisyach, T., Popov, D., Ubskiy, D., & Kabarov, V. (2019). Acoustic event mixing to multichannel AMI data for distant speech recognition and acoustic event classification benchmarking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11658 LNAI, pp. 31–42). Springer Verlag. https://doi.org/10.1007/978-3-030-26061-3_4
