Phonetics and machine learning: Hierarchical modelling of prosody in statistical speech synthesis

Abstract

Text-to-speech synthesis solves many real-world problems, such as providing speaking and reading ability to people who lack those capabilities. It is thus viewed mainly as an engineering problem rather than a purely scientific one, and many of the solutions in speech synthesis are therefore purely practical. However, from the point of view of phonetics, the process of producing speech from text artificially is also a scientific one. Here I argue, using an example from speech prosody, namely speech melody, that phonetics is the key discipline in helping to solve what is arguably one of the most interesting problems in machine learning.

Citation (APA)

Vainio, M. (2014). Phonetics and machine learning: Hierarchical modelling of prosody in statistical speech synthesis. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8791, 37–54. https://doi.org/10.1007/978-3-319-11397-5_3
