Research on Speech Synthesis Technology Based on Rhythm Embedding

Abstract

In recent years, Text-To-Speech (TTS) technology has developed rapidly, and increasing attention has been paid to narrowing the gap between synthetic and real speech, in particular to giving synthesized speech a natural rhythm. This paper proposes a rhythmic feature embedding method for TTS based on the Tacotron2 model, which has emerged in the TTS field in recent years. First, rhythmic features are extracted with the WORLD vocoder, which reduces redundant information in those features. Then, the rhythmic features are fused through a Variational Auto-Encoder (VAE) network to enhance the rhythmic information. Experiments are carried out on the LJSpeech-1.0 data set, and the synthesized speech is assessed by both subjective and objective evaluation. Compared with the method from the literature used for comparison, the subjective blind listening test (ABX) score increased by 25%, and the objective Mel Cepstral Distortion (MCD) value declined to 12.77.
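For readers unfamiliar with the objective metric, the following is a minimal sketch of one common way to compute Mel Cepstral Distortion between a reference and a synthesized utterance. It is not taken from the paper: the abstract does not say which MCD variant, cepstral order, or alignment procedure the authors used. The function name `mel_cepstral_distortion` is hypothetical, the inputs are assumed to be already time-aligned mel-cepstral sequences, and skipping the 0th (energy) coefficient is a common convention rather than a confirmed detail.

```python
import numpy as np


def mel_cepstral_distortion(ref, syn):
    """Mean MCD in dB between two aligned mel-cepstral sequences.

    ref, syn: arrays of shape (frames, coefficients), assumed to be
    time-aligned already (for example by dynamic time warping).
    The 0th coefficient (energy) is skipped; this is an assumption,
    not something stated in the abstract.
    """
    ref = np.asarray(ref, dtype=np.float64)
    syn = np.asarray(syn, dtype=np.float64)
    if ref.shape != syn.shape:
        raise ValueError("sequences must be aligned to the same shape")
    # Per-frame squared differences over coefficients 1..D.
    diff = ref[:, 1:] - syn[:, 1:]
    per_frame = (10.0 / np.log(10.0)) * np.sqrt(2.0 * np.sum(diff ** 2, axis=1))
    # Average the per-frame distortion over the utterance.
    return float(np.mean(per_frame))
```

Lower values indicate that the synthesized cepstra are closer to the reference, which is why the abstract reports a decline in MCD as an improvement.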

Citation

Wu, T., Zhao, L., & Zhang, Q. (2020). Research on Speech Synthesis Technology Based on Rhythm Embedding. In Journal of Physics: Conference Series (Vol. 1693). IOP Publishing Ltd. https://doi.org/10.1088/1742-6596/1693/1/012127
