Text-to-speech synthesis solves many real-world problems, such as giving speaking and reading ability to people who lack those capabilities. It is therefore viewed mainly as an engineering problem rather than a purely scientific one, and many solutions in speech synthesis are purely practical. From the point of view of phonetics, however, the process of producing speech from text artificially is also a scientific one. Here I argue, using an example from speech prosody, namely speech melody, that phonetics is the key discipline in helping to solve what is arguably one of the most interesting problems in machine learning.
Vainio, M. (2014). Phonetics and machine learning: Hierarchical modelling of prosody in statistical speech synthesis. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8791, 37–54. https://doi.org/10.1007/978-3-319-11397-5_3