Speaker interpolation for HMM-based speech synthesis system


Abstract

This paper describes an approach to voice characteristics conversion for an HMM-based text-to-speech synthesis system using speaker interpolation. Most text-to-speech systems that synthesize speech by concatenating speech units achieve acceptable quality, but they still cannot synthesize speech with varied voice qualities, such as different speaker individualities and emotions. To control speaker individuality and emotion, such systems therefore need a large database of speech units recorded with various voice characteristics, available in the synthesis phase. In contrast, our system synthesizes speech with the voice quality of an untrained speaker by interpolating HMM parameters among the HMM sets of several representative speakers, so it can produce varied voice qualities without a large database in the synthesis phase. The HMM interpolation technique is derived from a probabilistic similarity measure for HMMs. The results of subjective experiments show that the voice quality of synthesized speech can be changed gradually from one speaker's to another's by changing the interpolation ratio.
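As a rough illustration of what interpolating HMM parameters among representative speakers can look like, the sketch below blends the Gaussian output distribution of one HMM state across speakers using interpolation ratios that sum to one. The abstract does not spell out the interpolation rule, which the paper derives from a probabilistic similarity measure, so the plain weighted average of means and diagonal variances used here is an assumption made only for illustration. The names `Gaussian`, `interpolate_gaussians`, `speaker_a`, and `speaker_b` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation of one HMM state's output distribution:
# (mean vector, diagonal variance vector).
Gaussian = Tuple[List[float], List[float]]


def interpolate_gaussians(speakers: Sequence[Gaussian],
                          ratios: Sequence[float]) -> Gaussian:
    """Blend per-speaker Gaussians with a weighted average.

    Assumption: simple linear interpolation of means and variances.
    This is not claimed to be the rule derived in the paper.
    """
    if len(speakers) != len(ratios):
        raise ValueError("need exactly one ratio per speaker")
    if any(r < 0 for r in ratios) or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be non-negative and sum to 1")

    dim = len(speakers[0][0])
    mean = [sum(r * spk[0][d] for spk, r in zip(speakers, ratios))
            for d in range(dim)]
    var = [sum(r * spk[1][d] for spk, r in zip(speakers, ratios))
           for d in range(dim)]
    return mean, var


# Toy two-dimensional parameters for two representative speakers.
speaker_a: Gaussian = ([1.0, 2.0], [0.5, 0.5])
speaker_b: Gaussian = ([3.0, 6.0], [1.5, 2.5])

# Sweep the interpolation ratio from speaker A (1.0) to speaker B (0.0).
for ratio in (1.0, 0.75, 0.5, 0.25, 0.0):
    mean, var = interpolate_gaussians([speaker_a, speaker_b],
                                      [ratio, 1.0 - ratio])
    print(ratio, mean, var)
```

Sweeping the ratio moves the blended parameters smoothly from one speaker's values to the other's, which mirrors the gradual change in voice quality reported in the subjective experiments. A full system would apply such a rule to every state of every model before generating speech parameters.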

Cited by (Scopus)

Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing (85 citations)

A postfilter to modify the modulation spectrum in HMM-based speech synthesis (57 citations)

Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis (36 citations)


Citation (APA)

Yoshimura, T., Tokuda, K., Masuko, T., Kobayashi, T., & Kitamura, T. (2000). Speaker interpolation for HMM-based speech synthesis system. Acoustical Science and Technology, 21(4), 199–205. https://doi.org/10.1250/ast.21.199

