This paper presents a method for many-to-one voice conversion (VC) using phonetic posteriorgrams (PPGs) based on adversarial training of deep neural networks (DNNs). A conventional method for many-to-one VC learns a mapping function from input acoustic features to target acoustic features through separately trained DNN-based speech recognition and synthesis models. However, 1) inter-speaker differences observed in the PPGs and 2) an over-smoothing effect in the generated acoustic features degrade the converted speech quality. Our method applies domain-adversarial training to the recognition model to reduce the PPG differences, and incorporates a generative adversarial network (GAN) into the training of the synthesis model to alleviate the over-smoothing effect. Unlike the conventional method, ours jointly trains the recognition and synthesis models so that they are optimized for many-to-one VC. Experimental evaluations demonstrate that the proposed method significantly improves converted speech quality compared with conventional VC methods.
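Domain-adversarial training of this kind is commonly realized with a gradient reversal layer: the forward pass is the identity, while the backward pass negates (and scales) the gradient flowing from an auxiliary speaker classifier, pushing the recognition model toward speaker-independent PPGs. The minimal NumPy sketch below illustrates only this mechanism; the function names and the scaling factor `lam` are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def grl_forward(x):
    # Gradient reversal layer, forward pass: identity.
    # The PPG features pass through unchanged to the speaker classifier.
    return x

def grl_backward(grad_output, lam=1.0):
    # Backward pass: gradients from the speaker classifier are scaled
    # by -lam, so minimizing the classifier's loss simultaneously trains
    # the recognition model to *confuse* it (speaker-independent PPGs).
    return -lam * grad_output

# Toy check: identity forward, negated/scaled gradient backward.
x = np.array([0.5, -1.2, 3.0])
g = np.ones_like(x)
print(grl_forward(x))            # unchanged features
print(grl_backward(g, lam=0.5))  # reversed, scaled gradient
```

In a full system this layer sits between the recognition model's bottleneck (the PPGs) and the speaker classifier, while the synthesis model and its GAN discriminator are trained on the acoustic-feature side.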
Citation:
Saito, Y., Akuzawa, K., & Tachibana, K. (2020). Joint adversarial training of speech recognition and synthesis models for many-to-one voice conversion using phonetic posteriorgrams. IEICE Transactions on Information and Systems, E103D(9), 1978–1987. https://doi.org/10.1587/transinf.2019EDP7297