Segment specific concatenation cost for syllable based Bengali TTS

Abstract

This paper proposes a new method of calculating concatenation cost to improve the optimality of unit selection. Instead of defining the same set of concatenation costs for all speech unit transitions, costs are defined according to the type of transition. The main types of unit transition that occur in an utterance are voiced to voiced, voiced to unvoiced, and unvoiced to unvoiced. A natural measure of continuity is identified for each of these transitions, and costs are defined accordingly. For voiced to voiced transitions, pitch and energy continuity metrics are proposed in addition to spectral continuity. For voiced to unvoiced and unvoiced to unvoiced transitions, the silence duration embedded in the unvoiced region is proposed as the continuity metric. This segment specific approach to concatenation cost calculation improves the quality of syllable based text to speech synthesis. Listening tests support the effectiveness of the proposed methodology, showing a clear decrease in perceptual discontinuity at joins and an improvement in the overall quality of the synthesised speech. © 2011 Springer-Verlag.
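
The abstract does not give the cost formulas, so the following Python sketch only illustrates the general idea of switching the continuity measure on the transition type. Everything specific in it is an assumption made for illustration: the `Boundary` fields, the Euclidean spectral distance, the weights, the `expected_silence_ms` reference value, and the choice to apply the silence metric to any join that has an unvoiced side.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence


@dataclass
class Boundary:
    """Features on one side of a join. All field names are hypothetical."""
    voiced: bool
    spectrum: Sequence[float]  # e.g. a cepstral vector at the boundary frame
    pitch_hz: float            # F0 at the boundary; unused when unvoiced
    energy: float              # frame energy at the boundary
    silence_ms: float          # silence embedded in the unvoiced region


def spectral_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two boundary spectra (assumed metric)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def concatenation_cost(
    left: Boundary,
    right: Boundary,
    expected_silence_ms: float = 0.0,  # assumed reference silence duration
    w_spec: float = 1.0,               # assumed weights, not from the paper
    w_pitch: float = 1.0,
    w_energy: float = 1.0,
) -> float:
    """Pick the continuity measure according to the transition type."""
    if left.voiced and right.voiced:
        # Voiced to voiced: spectral, pitch and energy continuity.
        return (
            w_spec * spectral_distance(left.spectrum, right.spectrum)
            + w_pitch * abs(left.pitch_hz - right.pitch_hz)
            + w_energy * abs(left.energy - right.energy)
        )
    # At least one unvoiced side: compare the silence at the join with a
    # reference duration (an assumed way to turn silence into a cost).
    joined_silence = left.silence_ms + right.silence_ms
    return abs(joined_silence - expected_silence_ms)


v1 = Boundary(True, [0.0, 0.0], 100.0, 1.0, 0.0)
v2 = Boundary(True, [3.0, 4.0], 110.0, 1.5, 0.0)
u1 = Boundary(False, [], 0.0, 0.0, 30.0)

vv_cost = concatenation_cost(v1, v2)                            # 5.0 + 10.0 + 0.5
vu_cost = concatenation_cost(v1, u1, expected_silence_ms=50.0)  # |30 - 50|
```

In a unit selection search, a function of this shape would be evaluated for each candidate pair of adjacent units, so that spectral, pitch and energy mismatches are only penalised where they are meaningful, that is, at voiced to voiced joins.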

Cite

Narendra, N. P., & Rao, K. S. (2011). Segment specific concatenation cost for syllable based Bengali TTS. In Communications in Computer and Information Science (Vol. 168 CCIS, pp. 371–382). https://doi.org/10.1007/978-3-642-22606-9_38
