Prosodic break prediction with RNNs

Abstract

Predicting prosodic breaks from text is a fundamental task for achieving naturalness in text-to-speech applications. In this work we build a data-driven break predictor from linguistic features such as Part of Speech (POS) tags and the forward and backward word distance to punctuation marks. To exploit the sequential dependency between break decisions, we use a basic Recurrent Neural Network (RNN) model. In the experiments we evaluate the performance of a logistic regression model and of the recurrent one. The results show that logistic regression outperforms the baseline (CART) by 9.5% in F-score, and that adding the recurrent layer improves on the baseline further, by 11%.
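The forward and backward word distance to punctuation marks mentioned above can be illustrated with a minimal sketch. The abstract does not define the feature precisely, so everything here is an assumption: the function name `punctuation_distances`, the punctuation set `PUNCT`, the choice to treat the start and end of the token sequence like punctuation marks, and the convention that a word adjacent to a punctuation mark has distance 0.

```python
# Hypothetical punctuation set; the paper's actual inventory is not given in the abstract.
PUNCT = {".", ",", ";", ":", "!", "?"}


def punctuation_distances(tokens):
    """Return (word, backward, forward) for every non-punctuation token.

    backward: number of words between the previous punctuation mark
              (or the start of the sequence) and this word.
    forward:  number of words between this word and the next punctuation
              mark (or the end of the sequence).
    """
    words, backward = [], []
    since_punct = 0
    # Left-to-right pass: count words seen since the last punctuation mark.
    for tok in tokens:
        if tok in PUNCT:
            since_punct = 0
        else:
            words.append(tok)
            backward.append(since_punct)
            since_punct += 1

    forward = []
    until_punct = 0
    # Right-to-left pass: count words remaining until the next punctuation mark.
    for tok in reversed(tokens):
        if tok in PUNCT:
            until_punct = 0
        else:
            forward.append(until_punct)
            until_punct += 1
    forward.reverse()

    return list(zip(words, backward, forward))
```

For the tokens `the cat , which was black , slept .`, this sketch assigns `cat` a backward distance of 1 and a forward distance of 0, since a comma follows it directly. In a predictor like the one described, such distances would sit alongside the POS tag of each word as input features.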

Cite

Pascual, S., & Bonafonte, A. (2016). Prosodic break prediction with RNNs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10077 LNAI, pp. 64–72). Springer Verlag. https://doi.org/10.1007/978-3-319-49169-1_7
