Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish

  • Gonzalvo X
  • Socoró J
  • Iriondo I
  • et al.

Abstract

Hidden Markov model-based text-to-speech (HMM-TTS) synthesis is a technique for generating speech from trained statistical models in which the spectrum and prosody of basic speech units are modelled jointly. This paper presents the advances in our Spanish HMM-TTS system and reports a perceptual test comparing it with an extended PSOLA-based concatenative (E-PSOLA) system. The improvements concern the phonetic information and contextual factors, adapted to Castilian Spanish, and speech generation using a mixed excitation (ME) technique. The results show a preference for the new HMM-TTS system over the previous system, and a better MOS than a real E-PSOLA system in terms of acceptability, intelligibility and stability.

Cite

Gonzalvo, X., Socoró, J. C., Iriondo, I., Monzo, C., & Martínez Marroquín, E. (2007). Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish. SSW6-2007. Proceedings of the Sixth ISCA Tutorial and Research Workshop in Speech Synthesis, 362–367. Retrieved from http://www.isca-speech.org/archive/ssw6/ssw6_362.html
