Voice conversion between synthesized bilingual voices using line spectral frequencies


Abstract

Voice conversion is a technique that transforms the individuality of a source speaker into that of a target speaker. We propose a simple and intuitive voice conversion algorithm that requires no training data between different languages and uses text-to-speech generated speech rather than recorded real voices. The proposed method reconstructs the voice after transforming line spectral frequencies (LSF) with formant space warping functions. The formant space is the space formed by four representative monophthongs of each language. The warping functions are piecewise linear equations built from pairs of four formants at matched monophthongs. We apply LSF to voice conversion because LSF are not overly sensitive to quantization noise and can be interpolated. In our experiments, LSF-based voice conversion gives better results in ABX and MOS tests than direct frequency warping approaches.
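As an illustration of the piecewise linear warping idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that the warping function passes through (0, 0), through four matched source/target formant pairs, and through the Nyquist frequency, and that LSF values are expressed in Hz. The function name `make_warp`, the anchor points at 0 and Nyquist, and all formant values are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right


def make_warp(src_formants, tgt_formants, nyquist=8000.0):
    """Return a piecewise linear map f_src -> f_tgt.

    Assumption (not from the paper): the map is anchored at (0, 0) and
    (nyquist, nyquist), and both formant lists are strictly increasing
    and lie strictly between 0 and nyquist.
    """
    xs = [0.0] + list(src_formants) + [nyquist]
    ys = [0.0] + list(tgt_formants) + [nyquist]

    def warp(f):
        # Clamp to the valid frequency range.
        f = min(max(f, 0.0), nyquist)
        # Index of the segment [xs[i], xs[i + 1]] that contains f.
        i = min(bisect_right(xs, f) - 1, len(xs) - 2)
        t = (f - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return warp


# Hypothetical formant frequencies (Hz) for one matched monophthong.
src = [700.0, 1200.0, 2600.0, 3500.0]
tgt = [800.0, 1400.0, 2500.0, 3700.0]
warp = make_warp(src, tgt)

# Hypothetical LSF values (Hz) for one frame, in ascending order.
lsf_hz = [350.0, 700.0, 1900.0, 3500.0, 5750.0]
warped = [warp(f) for f in lsf_hz]
print(warped)
```

Because the map is monotonically increasing whenever the target formants are increasing, the warped LSF values keep their ascending order, which is the property that keeps the reconstructed synthesis filter stable.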

Cite

Yun, Y. S., Jung, J., & Eun, S. (2015). Voice conversion between synthesized bilingual voices using line spectral frequencies. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9319, pp. 463–471). Springer Verlag. https://doi.org/10.1007/978-3-319-23132-7_57
