Voice conversion for TTS systems with tuning on the target speaker based on GMM

Vadim Zahariev; Elias Azarov; Alexander Petrovsky

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10458 LNAI, pp. 788-798 (2017)

DOI: 10.1007/978-3-319-66429-3_79

Abstract

The paper addresses the improvement of voice conversion (VC) methods for text-to-speech (TTS) synthesis systems that can be tuned to a target speaker. It presents such a system, with a VC module in the acoustic processor and a parametric representation of the speech database for concatenative synthesis based on an instantaneous harmonic representation. Voice conversion is based on a multiple regression mapping function and a Gaussian mixture model (GMM); text-independent training is based on hidden Markov models and a modified Viterbi algorithm. An experimental evaluation of the proposed solutions in terms of naturalness and similarity is also presented.
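The abstract does not give the mapping function itself, so the following is only a minimal sketch of the regression form commonly used in GMM-based voice conversion, not the authors' implementation. It assumes scalar features for readability (the paper works with multidimensional harmonic parameters), and every name in it (`convert_frame`, `weights`, `mu_x`, `var_x`, `mu_y`, `cov_yx`) is hypothetical. Each mixture component contributes a linear regression from source to target, weighted by the posterior probability of that component given the source frame.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert_frame(x, weights, mu_x, var_x, mu_y, cov_yx):
    """Map one scalar source feature x to the target speaker's space.

    Assumed mapping (common joint-density GMM regression, scalar case):
        F(x) = sum_m p(m | x) * (mu_y[m] + cov_yx[m] / var_x[m] * (x - mu_x[m]))
    where p(m | x) is the posterior of component m given x.
    """
    # Weighted likelihood of x under each source-side component.
    likelihoods = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, mu_x, var_x)]
    total = sum(likelihoods)
    posteriors = [l / total for l in likelihoods]

    # Posterior-weighted sum of per-component linear regressions.
    y = 0.0
    for p, mx, vx, my, cyx in zip(posteriors, mu_x, var_x, mu_y, cov_yx):
        y += p * (my + cyx / vx * (x - mx))
    return y


if __name__ == "__main__":
    # Illustrative parameters only; real values would come from training
    # on aligned source and target frames.
    print(convert_frame(0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0], [10.0, 20.0], [0.0, 0.0]))
```

In this sketch a source frame that lies exactly between two equally weighted components is mapped to the average of their target means, because the cross-covariance terms are set to zero. With nonzero `cov_yx`, each component also shifts its output in proportion to how far the frame is from that component's source mean.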

Cite

Zahariev, V., Azarov, E., & Petrovsky, A. (2017). Voice conversion for TTS systems with tuning on the target speaker based on GMM. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 788–798). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_79
