Estimating sentence final tone labels using dialogue-act information for text-to-speech synthesis within a spoken dialogue system

Nobukatsu Hojo; Yusuke Ijima; Hiroaki Sugiyama

Journal Article, Open Access

Transactions of the Japanese Society for Artificial Intelligence (2020) 35(4) 1-11

DOI: 10.1527/tjsai.A-JA5

Abstract

This paper proposes a novel method for estimating sentence final tone labels using dialogue-act (DA) information for text-to-speech synthesis within a spoken dialogue system. Estimating appropriate sentence final tone labels is considered essential for conveying the system's intentions accurately to users through an utterance. We propose to use DA features in addition to the conventional features, namely morphological information of the utterance text, to estimate the sentence final tone labels. For this study, we use a speech database with DA tags that we constructed in our previous study. We added sentence final tone labels to this database so that each utterance has its utterance text, DA, and sentence final tone label. Based on this database, we build the proposed sentence final tone estimation model. We evaluated the proposed method by comparing its performance with that of the conventional method. The evaluation results show that the proposed method outperforms the conventional method in accuracy. We also analyze the estimation results to investigate the efficacy and the difficulties of the proposed method.
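
The abstract does not state how the features are encoded or which model is used, so the following is only a minimal sketch of the general idea: a conventional feature vector built from morphological information is extended with a DA feature before it is passed to a tone-label estimator. The vocabularies, the DA tag names, the one-hot and bag-of-morphemes encodings, and the function names are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def bag_of_items(items: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Binary presence vector of `items` over `vocabulary` (assumed encoding)."""
    present = set(items)
    return [1.0 if entry in present else 0.0 for entry in vocabulary]


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """One-hot vector for `value`; all zeros if the value is not in the vocabulary."""
    return [1.0 if value == entry else 0.0 for entry in vocabulary]


def build_features(
    morphemes: Sequence[str],
    dialogue_act: str,
    morpheme_vocab: Sequence[str],
    da_vocab: Sequence[str],
    use_da: bool = True,
) -> List[float]:
    """Concatenate morphological features with an optional DA feature.

    With use_da=False this corresponds to a morphology-only baseline;
    with use_da=True the DA one-hot vector is appended.
    """
    features = bag_of_items(morphemes, morpheme_vocab)
    if use_da:
        features = features + one_hot(dialogue_act, da_vocab)
    return features


# Hypothetical toy vocabularies, for illustration only.
MORPHEME_VOCAB = ["desu", "ka", "ne", "yo"]
DA_VOCAB = ["question", "inform", "greeting"]

baseline_vector = build_features(
    ["desu", "ka"], "question", MORPHEME_VOCAB, DA_VOCAB, use_da=False
)
proposed_vector = build_features(
    ["desu", "ka"], "question", MORPHEME_VOCAB, DA_VOCAB, use_da=True
)
```

Either vector could then be fed to any standard classifier over the tone-label set; comparing the accuracy obtained with and without the DA part mirrors the comparison described in the abstract.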

Cite

Hojo, N., Ijima, Y., & Sugiyama, H. (2020). Estimating sentence final tone labels using dialogue-act information for text-to-speech synthesis within a spoken dialogue system. Transactions of the Japanese Society for Artificial Intelligence, 35(4), 1–11. https://doi.org/10.1527/tjsai.A-JA5
