Noise and speech estimation as auxiliary tasks for robust speech recognition

Gueorgui Pironkov; Stéphane Dupont; Sean U.N. Wood; Thierry Dutoit

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10583 LNAI, pp. 181–192 (2017)

DOI: 10.1007/978-3-319-68456-7_15

Abstract

Noise that degrades speech remains a major problem for automatic speech recognition. One promising way to tackle it is multi-task learning, for which clean-speech generation is an effective auxiliary task. The auxiliary task is trained alongside the main speech recognition task, and its goal is to improve the results of the main task. In this paper, we investigate this idea further by generating features extracted directly from the audio file containing only the noise, instead of the clean speech. After demonstrating that this noise-estimation auxiliary task yields an improvement, we show that using both the noise and clean-speech estimation auxiliary tasks leads to a 4% relative word error rate improvement over classic single-task learning on the CHiME4 dataset.
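
As a rough illustration of the multi-task idea described in the abstract, the sketch below combines a main recognition loss with two auxiliary regression losses, one for clean-speech features and one for noise features, for a single frame. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the use of cross-entropy and mean squared error, and the weights `w_clean` and `w_noise` are all hypothetical choices, since the abstract does not specify the architecture, the losses, or the task weighting.

```python
import math


def cross_entropy(logits, label):
    """Negative log-probability of the correct class under a softmax."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def mse(pred, target):
    """Mean squared error between two equal-length feature vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def multitask_loss(asr_logits, label,
                   clean_pred, clean_target,
                   noise_pred, noise_target,
                   w_clean=0.1, w_noise=0.1):
    """Main recognition loss plus weighted auxiliary estimation losses.

    w_clean and w_noise are hypothetical weights, not values from the paper.
    """
    main = cross_entropy(asr_logits, label)      # main task: speech recognition
    aux_clean = mse(clean_pred, clean_target)    # auxiliary: clean-speech features
    aux_noise = mse(noise_pred, noise_target)    # auxiliary: noise-only features
    return main + w_clean * aux_clean + w_noise * aux_noise
```

Setting both weights to zero recovers the single-task loss, which makes the single-task baseline a special case of the same objective.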

Citation (APA)

Pironkov, G., Dupont, S., Wood, S. U. N., & Dutoit, T. (2017). Noise and speech estimation as auxiliary tasks for robust speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10583 LNAI, pp. 181–192). Springer Verlag. https://doi.org/10.1007/978-3-319-68456-7_15
