Noise and speech estimation as auxiliary tasks for robust speech recognition

Abstract

Noise that degrades speech remains a major problem for automatic speech recognition. One promising way to tackle this problem is multi-task learning, for which clean-speech generation is an effective auxiliary task. This auxiliary task is trained alongside the main speech recognition task, and its goal is to improve the results of the main task. In this paper, we investigate this idea further by generating features extracted directly from the audio file containing only the noise, instead of the clean speech. After demonstrating that this noise-estimation auxiliary task yields an improvement, we also show that using both noise and clean-speech estimation as auxiliary tasks leads to a 4% relative word error rate improvement over classic single-task learning on the CHiME4 dataset.
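The abstract does not give the training objective, so the following is only a minimal sketch of how a main recognition loss might be combined with the two auxiliary estimation losses. It assumes a cross-entropy loss for the main task, mean squared error for the clean-speech and noise feature estimates, and a single hypothetical weight `aux_weight`; none of these names or values come from the paper.

```python
import math


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the correct class for one frame."""
    return -math.log(probs[target_index])


def mean_squared_error(predicted, reference):
    """Average squared difference between two equal-length feature vectors."""
    if len(predicted) != len(reference):
        raise ValueError("feature vectors must have the same length")
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(predicted)


def multi_task_loss(state_probs, state_target,
                    clean_pred, clean_ref,
                    noise_pred, noise_ref,
                    aux_weight=0.1):
    """Main recognition loss plus weighted auxiliary regression losses.

    aux_weight is a hypothetical value chosen for illustration,
    not a setting reported in the paper.
    """
    main = cross_entropy(state_probs, state_target)
    clean = mean_squared_error(clean_pred, clean_ref)
    noise = mean_squared_error(noise_pred, noise_ref)
    return main + aux_weight * (clean + noise)


# Toy frame: three output classes and two-dimensional feature targets.
loss = multi_task_loss(
    state_probs=[0.7, 0.2, 0.1], state_target=0,
    clean_pred=[0.5, 0.5], clean_ref=[0.5, 0.5],
    noise_pred=[1.0, 0.0], noise_ref=[0.0, 0.0],
)
```

In a setup like this, the auxiliary outputs would only be used during training; at recognition time only the main task's output is needed, so setting `aux_weight` to zero recovers single-task learning.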

Cite

Pironkov, G., Dupont, S., Wood, S. U. N., & Dutoit, T. (2017). Noise and speech estimation as auxiliary tasks for robust speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10583 LNAI, pp. 181–192). Springer Verlag. https://doi.org/10.1007/978-3-319-68456-7_15
