Resynthesis of prosodic information using the cepstrum vocoder

2Citations
Citations of this article
5Readers
Mendeley users who have this article in their library.

Abstract

The naturalness of synthetic speech depends on automatic extraction of prosodic features and prosody modeling. To improve the naturalness of the synthesized speech, we want to apply the concept of Analysis-by-Synthesis of prosodic information. Therefore, the accents and phrases of the speech signal were extracted using the quantitative Fujisaki model in a recognition model. In a generative model we resynthesized the speech signal using a cepstrum vocoder. The excitation signal of the vocoder are the pitch marks (PM), which were calculated from multiple levels of the accent and phrase marking algorithm. A preference test was performed to confirm the performance of the proposed method. For every speech signal four signals were resynthesized according to the calculated PM. Evaluators compared the resynthesized signals with one another. Results show that the quality of the resynthesized signal after prosodic marking is better.

Cite

CITATION STYLE

APA

Hussein, H., Strecha, G., & Hoffmann, R. (2010). Resynthesis of prosodic information using the cepstrum vocoder. In Proceedings of the International Conference on Speech Prosody. International Speech Communication Association. https://doi.org/10.21437/speechprosody.2010-102

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free