Evaluating and correcting phoneme segmentation for unit selection synthesis

John Kominek; Christina Bennett; Alan W. Black

Conference Proceedings

EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology (2003) 313-316

DOI: 10.21437/eurospeech.2003-127

Abstract

As part of improved support for building unit selection voices, the Festival speech synthesis system now includes two algorithms for automatic labeling of wavefile data. One is based on dynamic time warping (DTW), the other on hidden Markov model (HMM) acoustic modeling. Our experiments show that DTW is more accurate 70% of the time but is also more prone to gross labeling errors. HMM modeling exhibits a systematic bias of 15 ms. Combining both methods directs human labelers towards the data most likely to be problematic.
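
The idea of combining two labelers to direct human attention can be illustrated with a minimal sketch. It is not the procedure from the paper, whose details are not given in the abstract. It assumes each method yields one boundary time (in seconds) per phoneme for the same phoneme sequence, and it flags boundaries where the two disagree by more than a threshold. The function name `flag_disagreements`, the 40 ms default threshold, and the sample boundary times are all hypothetical.

```python
from typing import List, Tuple


def flag_disagreements(
    dtw_bounds: List[float],
    hmm_bounds: List[float],
    threshold_s: float = 0.040,  # assumed value, not taken from the paper
) -> List[Tuple[int, float]]:
    """Return (boundary index, DTW minus HMM time in seconds) for every
    boundary where the two labelings differ by more than threshold_s."""
    if len(dtw_bounds) != len(hmm_bounds):
        raise ValueError("both labelings must cover the same phoneme sequence")
    flagged = []
    for i, (d, h) in enumerate(zip(dtw_bounds, hmm_bounds)):
        diff = d - h
        if abs(diff) > threshold_s:
            flagged.append((i, diff))
    return flagged


# Made-up boundary times for four phonemes, for illustration only.
dtw = [0.10, 0.25, 0.50, 0.62]
hmm = [0.11, 0.26, 0.38, 0.63]
suspect = flag_disagreements(dtw, hmm)
print(suspect)  # boundaries a human labeler might check first
```

In this sketch a large disagreement does not say which method is wrong, only that the boundary is worth a manual check.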

Cite

Kominek, J., Bennett, C., & Black, A. W. (2003). Evaluating and correcting phoneme segmentation for unit selection synthesis. In EUROSPEECH 2003 - 8th European Conference on Speech Communication and Technology (pp. 313–316). International Speech Communication Association. https://doi.org/10.21437/eurospeech.2003-127
