DNN-Based cepstral excitation manipulation for speech enhancement

Samy Elshamy; Tim Fingscheidt

Journal Article (Open Access)

IEEE/ACM Transactions on Audio Speech and Language Processing (2019) 27(11) 1803-1814

DOI: 10.1109/TASLP.2019.2933698

10 Citations

11 Readers

Abstract

This contribution aims at speech model-based speech enhancement by exploiting the source-filter model of human speech production. The proposed method enhances the excitation signal in the cepstral domain by making use of a deep neural network (DNN). We investigate two types of target representations along with the significant effects of their normalization. The new approach exceeds the performance of a formerly introduced classical signal processing-based cepstral excitation manipulation (CEM) method in terms of noise attenuation by about 1.5 dB. We show that this gain also holds true when comparing serial combinations of envelope and excitation enhancement. In the important low-SNR conditions, no significant trade-off for speech component quality or speech intelligibility is induced, while allowing for substantially higher noise attenuation. In total, a traditional purely statistical state-of-the-art speech enhancement system is outperformed by more than 3 dB noise attenuation.
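
The abstract does not spell out the processing chain, so the following is only a minimal sketch of the general idea behind the source-filter model in the cepstral domain: the low-quefrency part of a frame's real cepstrum roughly describes the spectral envelope, and the remainder roughly describes the excitation, where voiced speech shows a peak at the pitch period. It is not the authors' method and contains no DNN. The function name `split_cepstrum`, the cutoff `n_env`, and the synthetic test frame are assumptions made for illustration.

```python
import numpy as np


def split_cepstrum(frame, n_env=30):
    """Split the real cepstrum of one frame into envelope and excitation parts.

    n_env is a hypothetical quefrency cutoff (in samples), not a value
    taken from the paper.
    """
    n = len(frame)
    windowed = frame * np.hanning(n)
    # Real cepstrum: inverse DFT of the log magnitude spectrum.
    log_mag = np.log(np.abs(np.fft.fft(windowed)) + 1e-12)
    cepstrum = np.real(np.fft.ifft(log_mag))

    # Keep the low quefrencies and their mirrored counterparts
    # (the real cepstrum is symmetric) as the envelope part.
    lifter = np.zeros(n)
    lifter[:n_env] = 1.0
    lifter[n - n_env + 1:] = 1.0
    envelope = cepstrum * lifter
    excitation = cepstrum - envelope
    return envelope, excitation


# Synthetic "voiced" frame (an assumption for the demo): a pulse train
# with a period of 80 samples, shaped by a decaying impulse response.
period = 80
pulses = np.zeros(512)
pulses[::period] = 1.0
vocal_tract = 0.9 ** np.arange(50)
frame = np.convolve(pulses, vocal_tract)[:512]

envelope, excitation = split_cepstrum(frame)
# Search for the pitch peak in the excitation part only.
peak = 40 + int(np.argmax(excitation[40:256]))
print("pitch peak at quefrency", peak)
```

In a scheme of this kind, an excitation enhancement stage would modify only the excitation part and leave the envelope part to a separate stage, which matches the abstract's mention of serial combinations of envelope and excitation enhancement.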

Cite

Elshamy, S., & Fingscheidt, T. (2019). DNN-Based cepstral excitation manipulation for speech enhancement. IEEE/ACM Transactions on Audio Speech and Language Processing, 27(11), 1803–1814. https://doi.org/10.1109/TASLP.2019.2933698