Voice conversion for TTS systems with tuning on the target speaker based on GMM

Abstract

The paper is devoted to improving voice conversion (VC) methods for text-to-speech (TTS) synthesis systems that can be tuned to a target speaker. It presents such a system, with a VC module in the acoustic processor and a parametric representation of the speech database for concatenative synthesis based on an instantaneous harmonic representation. Voice conversion is based on a multiple regression mapping function and a Gaussian mixture model (GMM); text-independent learning is based on hidden Markov models and a modified Viterbi algorithm. An experimental evaluation of the proposed solutions in terms of naturalness and similarity is also presented.
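
The abstract does not give the form of the mapping function. As a rough illustration only, the sketch below shows the widely used joint-density GMM regression, in which each mixture component contributes a linear regression weighted by its posterior probability. It is not taken from the paper; the function name `gmm_convert` and all parameter names are hypothetical, and the model parameters are assumed to have been trained already.

```python
import numpy as np

def gmm_convert(x, weights, mu_x, mu_y, sigma_xx, sigma_yx):
    """Map one source feature vector x into the target speaker's space.

    Hypothetical parameters (assumed already trained, not from the paper):
      weights  : (M,)       mixture weights
      mu_x     : (M, D)     source means
      mu_y     : (M, D)     target means
      sigma_xx : (M, D, D)  source covariances
      sigma_yx : (M, D, D)  target-source cross-covariances
    """
    n_components = len(weights)
    dim = len(x)
    # Log of weight * N(x | mu_x[m], sigma_xx[m]) for each component m.
    log_resp = np.empty(n_components)
    for m in range(n_components):
        diff = x - mu_x[m]
        _, logdet = np.linalg.slogdet(sigma_xx[m])
        maha = diff @ np.linalg.solve(sigma_xx[m], diff)
        log_resp[m] = np.log(weights[m]) - 0.5 * (
            logdet + maha + dim * np.log(2 * np.pi)
        )
    # Normalize to posteriors p(m | x), subtracting the max for stability.
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    # Posterior-weighted sum of the per-component linear regressions.
    y = np.zeros(dim, dtype=float)
    for m in range(n_components):
        regression = sigma_yx[m] @ np.linalg.solve(sigma_xx[m], x - mu_x[m])
        y += resp[m] * (mu_y[m] + regression)
    return y
```

In a sketch like this, `x` would be one frame of spectral features from the source speaker, and the function would be applied frame by frame before synthesis.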

Cite

Zahariev, V., Azarov, E., & Petrovsky, A. (2017). Voice conversion for TTS systems with tuning on the target speaker based on GMM. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 788–798). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_79
