Segment specific concatenation cost for syllable based Bengali TTS

N. P. Narendra; K. Sreenivasa Rao

Conference Proceedings

Communications in Computer and Information Science (2011) 168 CCIS 371-382

DOI: 10.1007/978-3-642-22606-9_38

Abstract

This paper proposes a new method of calculating concatenation cost to improve the optimality of unit selection. Instead of defining the same set of concatenation costs for all types of speech unit transitions, costs are defined according to the type of transition. The main types of unit transition that occur in an utterance are voiced to voiced, voiced to unvoiced, and unvoiced to unvoiced. A natural measure of continuity is identified for each of these transitions, and costs are defined accordingly. For voiced to voiced transitions, pitch and energy continuity metrics are proposed in addition to spectral continuity. For voiced to unvoiced and unvoiced to unvoiced transitions, the silence duration embedded in the unvoiced region is proposed as the continuity metric. This segment specific approach to concatenation cost calculation improves the quality of syllable based text to speech synthesis. Listening tests demonstrate the effectiveness of the proposed method, showing a clear decrease in perceptual discontinuity at joins and an improvement in the overall quality of the synthesised speech. © 2011 Springer-Verlag.
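The idea of choosing the continuity measure by transition type can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the `UnitEdge` fields, the weights, the log-pitch distance, and the mapping from silence duration to cost (longer silence, lower cost) are all assumptions made for illustration. The unvoiced to voiced case is not mentioned in the abstract and is routed to the silence-based branch here, also as an assumption.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UnitEdge:
    """Features at the joining boundary of a candidate syllable unit.

    All field names are hypothetical; the abstract does not specify them.
    """
    voiced: bool
    spectrum: Sequence[float]  # e.g. a cepstral vector at the boundary frame
    f0_hz: float               # pitch at the boundary (unused if unvoiced)
    energy_db: float           # energy at the boundary
    silence_ms: float          # silence embedded at the boundary (0 if voiced)


def concatenation_cost(left: UnitEdge, right: UnitEdge,
                       w_spec: float = 1.0,
                       w_f0: float = 1.0,
                       w_energy: float = 1.0) -> float:
    """Return a join cost whose definition depends on the transition type."""
    if left.voiced and right.voiced:
        # Voiced to voiced: spectral, pitch and energy continuity.
        spectral = math.dist(left.spectrum, right.spectrum)
        pitch = abs(math.log(left.f0_hz) - math.log(right.f0_hz))
        energy = abs(left.energy_db - right.energy_db)
        return w_spec * spectral + w_f0 * pitch + w_energy * energy

    # At least one side is unvoiced: use the embedded silence duration.
    # Assumption: more silence at the join makes it less audible, so the
    # cost decreases as the total silence grows.
    silence = left.silence_ms + right.silence_ms
    return 1.0 / (1.0 + silence)
```

In a unit selection search, a function of this kind would be evaluated for every pair of adjacent candidate units, alongside a target cost, when scoring candidate sequences.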

Cite

Narendra, N. P., & Rao, K. S. (2011). Segment specific concatenation cost for syllable based Bengali TTS. In Communications in Computer and Information Science (Vol. 168 CCIS, pp. 371–382). https://doi.org/10.1007/978-3-642-22606-9_38
