Abstract
In noisy environments, recorded speech is distorted both by additive noise and by the Lombard effect, so automatic speech recognition (ASR) performance degrades. Conventional studies have addressed this problem with noise reduction methods, but they have not sufficiently discussed how to improve ASR performance under the Lombard effect. In the present paper, we focus on the robust detection of speech produced under the Lombard effect (Lombard speech), because an ASR system that detects Lombard speech can switch to a suitable acoustic model. We previously proposed a detection method for Lombard speech based on the second-order mel-frequency cepstral coefficient (2nd-order MFCC) and the fundamental frequency (f0). That method, however, requires long utterances to detect Lombard speech. We therefore propose a new detection method for Lombard speech that uses the 2nd-order MFCC and the spectral envelope at the beginning of talking-speech. To detect Lombard speech within a short time, the proposed method applies variable weights, which depend on elapsed time, to the 2nd-order MFCC and the spectral envelope. Evaluation experiments confirmed that the detection time was shorter than with the conventional method. © 2013 Acoustical Society of America.
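The time-dependent weighting described above can be illustrated with a minimal sketch. The abstract does not give the weighting function, the score definitions, or the decision rule, so everything here is an assumption made for illustration: the linear ramp, the parameter values, the direction of the ramp (spectral envelope weighted more heavily early on), the threshold, and the names `time_varying_weight`, `fused_score`, and `is_lombard`. The two input scores are assumed to be per-feature Lombard likelihoods already normalized to the range 0 to 1.

```python
def time_varying_weight(elapsed_s, ramp_s=1.0, w_start=0.2, w_end=0.8):
    """Hypothetical weight for the 2nd-order MFCC score.

    Ramps linearly from w_start to w_end over ramp_s seconds, then stays
    at w_end. The ramp shape and all values are assumptions, not taken
    from the paper.
    """
    frac = min(max(elapsed_s / ramp_s, 0.0), 1.0)  # clamp to [0, 1]
    return w_start + (w_end - w_start) * frac


def fused_score(mfcc_score, envelope_score, elapsed_s):
    """Combine the two feature scores with weights that depend on elapsed time."""
    w = time_varying_weight(elapsed_s)
    return w * mfcc_score + (1.0 - w) * envelope_score


def is_lombard(mfcc_score, envelope_score, elapsed_s, threshold=0.5):
    """Flag Lombard speech when the fused score reaches an assumed threshold."""
    return fused_score(mfcc_score, envelope_score, elapsed_s) >= threshold
```

With these assumed settings, a strong spectral-envelope score alone can trigger detection early in the utterance, while the MFCC score dominates the decision once more speech has been observed.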
Cite
Furoh, T., Fukumori, T., Nakayama, M., & Nishiura, T. (2013). Detection for Lombard speech with second-order mel-frequency cepstral coefficient and spectral envelope in beginning of talking-speech. In Proceedings of Meetings on Acoustics (Vol. 19). https://doi.org/10.1121/1.4800476