Statistical parametric speech synthesis with a novel codebook-based excitation model

Tamás Gábor Csapó; Géza Németh

Conference Proceedings (Open Access)


Intelligent Decision Technologies (2014) 8(4) 289-299

DOI: 10.3233/IDT-140197


Abstract

Speech synthesis is an important modality in Cognitive Infocommunications, the intersection of informatics and cognitive sciences. Statistical parametric methods have recently gained importance in speech synthesis. In these methods the speech signal is decomposed into parameters and later restored from them; the decomposition is implemented by speech coders. We apply a novel codebook-based speech coding method to model the excitation of speech. In the analysis stage the speech signal is analyzed frame by frame and a codebook of pitch-synchronous excitations is built from the voiced parts. Timing, gain and harmonic-to-noise ratio parameters are extracted and fed into the machine learning stage of hidden Markov model-based speech synthesis. During the synthesis stage the codebook is searched for a suitable element in each voiced frame, and the selected elements are concatenated to create the excitation signal, from which the final synthesized speech is created. Our initial experiments show that the model fits well into the statistical parametric speech synthesis framework and that in most cases it synthesizes speech of better quality than the traditional pulse-noise excitation. (This paper is an extended version of [10].)
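The analysis and synthesis stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the one-period segment length, the RMS gain and the relative squared-error selection cost are all assumptions made for illustration. The harmonic-to-noise ratio is left out because the abstract does not say how it is estimated, and in the paper's framework the target parameters would come from the HMM stage rather than being supplied by hand.

```python
import math


def rms(samples):
    """Root-mean-square value of a segment, used here as its gain (assumption)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def build_codebook(residual, pitch_marks):
    """Analysis stage: cut a voiced excitation signal into pitch-synchronous entries.

    Each entry spans one pitch mark to the next (an assumed segment length).
    """
    codebook = []
    for start, end in zip(pitch_marks, pitch_marks[1:]):
        segment = list(residual[start:end])
        if not segment:
            continue
        codebook.append({
            "period": end - start,   # timing parameter, in samples
            "gain": rms(segment),    # gain parameter
            "samples": segment,
        })
    return codebook


def select_entry(codebook, target_period, target_gain):
    """Pick the entry closest to the targets (hypothetical cost function).

    Targets are assumed to be strictly positive.
    """
    def cost(entry):
        dp = (entry["period"] - target_period) / target_period
        dg = (entry["gain"] - target_gain) / target_gain
        return dp * dp + dg * dg

    return min(codebook, key=cost)


def synthesize_excitation(codebook, targets):
    """Synthesis stage: select one entry per voiced frame and concatenate."""
    excitation = []
    for target_period, target_gain in targets:
        entry = select_entry(codebook, target_period, target_gain)
        # Rescale the stored segment so that its gain matches the target.
        scale = target_gain / entry["gain"] if entry["gain"] > 0 else 0.0
        excitation.extend(s * scale for s in entry["samples"])
    return excitation


# Toy data: two pitch periods of 4 and 6 samples.
residual = [1.0, -1.0, 1.0, -1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]
pitch_marks = [0, 4, 10]
codebook = build_codebook(residual, pitch_marks)
excitation = synthesize_excitation(codebook, [(4, 0.5), (6, 2.0)])
```

In this toy run each target frame is matched to the stored segment with the nearest period and gain, and the segment is rescaled to the requested gain before concatenation. A real system would feed the resulting excitation through a spectral filter to produce speech.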


Cite


Csapó, T. G., & Németh, G. (2014). Statistical parametric speech synthesis with a novel codebook-based excitation model. In Intelligent Decision Technologies (Vol. 8, pp. 289–299). IOS Press. https://doi.org/10.3233/IDT-140197
