DNN-Based cepstral excitation manipulation for speech enhancement

Abstract

This contribution aims at speech model-based speech enhancement by exploiting the source-filter model of human speech production. The proposed method enhances the excitation signal in the cepstral domain by making use of a deep neural network (DNN). We investigate two types of target representations along with the significant effects of their normalization. The new approach exceeds the performance of a formerly introduced classical signal processing-based cepstral excitation manipulation (CEM) method in terms of noise attenuation by about 1.5 dB. We show that this gain also holds true when comparing serial combinations of envelope and excitation enhancement. In the important low-SNR conditions, no significant trade-off for speech component quality or speech intelligibility is induced, while allowing for substantially higher noise attenuation. In total, a traditional purely statistical state-of-the-art speech enhancement system is outperformed by more than 3 dB noise attenuation.
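The source-filter split that the abstract refers to can be illustrated with a small NumPy sketch. This is not the authors' DNN-based method, only a generic textbook-style cepstral decomposition: the low-quefrency part of the real cepstrum approximates the spectral envelope (filter) and the remainder approximates the excitation (source). The function name `split_cepstrum`, the cutoff `n_envelope=20`, the frame length, and the synthetic test signal are all assumptions chosen for illustration.

```python
import numpy as np


def split_cepstrum(frame, n_envelope=20):
    """Split the real cepstrum of one frame into envelope and excitation parts.

    n_envelope is an assumed quefrency cutoff (in bins), not a value from the paper.
    """
    n = len(frame)
    windowed = frame * np.hanning(n)
    # Log-magnitude spectrum; the small constant avoids log(0).
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + 1e-12)
    # Real cepstrum: inverse DFT of the log-magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag).real
    # Symmetric low-quefrency lifter: keep bins 0..n_envelope-1 and their mirror.
    lifter = np.zeros(n)
    lifter[:n_envelope] = 1.0
    lifter[n - n_envelope + 1:] = 1.0
    c_env = cepstrum * lifter           # approximates the spectral envelope
    c_exc = cepstrum * (1.0 - lifter)   # approximates the excitation
    return c_env, c_exc


# Synthetic voiced frame (assumption): impulse train through a decaying filter.
n = 512
period = 80  # pitch period in samples
excitation = np.zeros(n)
excitation[::period] = 1.0
impulse_response = 0.9 ** np.arange(32)
frame = np.convolve(excitation, impulse_response)[:n]

c_env, c_exc = split_cepstrum(frame)
# The strongest excitation coefficient in the first half indicates the pitch period.
pitch_bin = int(np.argmax(c_exc[: n // 2]))
print("estimated pitch period (samples):", pitch_bin)
```

In such a decomposition, an excitation-enhancement stage would operate on `c_exc` only, while an envelope-enhancement stage would operate on `c_env`; the two log spectra add back up to the log-magnitude spectrum of the frame.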

Cite

Elshamy, S., & Fingscheidt, T. (2019). DNN-Based cepstral excitation manipulation for speech enhancement. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 27(11), 1803–1814. https://doi.org/10.1109/TASLP.2019.2933698
