Prosodic Processing

Abstract

Speech synthesis systems have to generate natural-sounding speech output from text. One of the key aspects of speech is prosody, which must be both natural (i.e., sounding like a human) and meaningful (i.e., sounding like a human who understands the contents of the text). The computation of prosody from text can be divided into the computation of prosodic tags from text and the computation of acoustic speech features from these tags. This chapter focuses on the latter. It provides an overview of prosody in human-human communication, including the communicative functions of prosody and their acoustic correlates. It then gives a historical overview of the methods that have been used for prosody generation in speech synthesis, as well as of current methods. Special attention is paid to prosody generation in unit selection synthesis, in which large corpora are searched for fragments of speech that match the phonemes and prosodic tags computed from text and that optimize various cost functions, and in which prosody is not modeled and speech is not modified. We conclude the chapter by advocating hybrid approaches in which the search capabilities of unit selection methods are combined with the speech modification methods of more traditional approaches.
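The unit selection search mentioned in the abstract can be illustrated with a minimal sketch. It is not the chapter's method: the `Unit` and `Target` records, the 0/1 tag-mismatch target cost, and the pitch-discontinuity join cost scaled by 100 Hz are all hypothetical choices made for illustration. The search itself is a standard dynamic-programming (Viterbi-style) pass that picks one recorded fragment per target so that the summed target and join costs are minimal, without modifying the selected speech.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """What the text analysis asks for (hypothetical fields)."""
    phoneme: str
    tag: str  # symbolic prosodic tag, e.g. "accented"


@dataclass(frozen=True)
class Unit:
    """A recorded fragment in the corpus (hypothetical fields)."""
    phoneme: str
    tag: str
    start_f0: float  # pitch at the left edge, in Hz
    end_f0: float    # pitch at the right edge, in Hz


def target_cost(target: Target, unit: Unit) -> float:
    # Assumed cost: 0 if the prosodic tag matches, 1 otherwise.
    return 0.0 if target.tag == unit.tag else 1.0


def join_cost(left: Unit, right: Unit) -> float:
    # Assumed cost: pitch jump at the concatenation point, scaled by 100 Hz.
    return abs(left.end_f0 - right.start_f0) / 100.0


def select_units(targets, corpus):
    """Return (total_cost, units) minimizing target plus join costs."""
    if not targets:
        return 0.0, []
    # Only units with the right phoneme are candidates for a target.
    candidates = [[u for u in corpus if u.phoneme == t.phoneme] for t in targets]
    if any(not c for c in candidates):
        raise ValueError("corpus has no unit for at least one target phoneme")

    # best[i] holds the cheapest path ending in the i-th candidate so far.
    best = [(target_cost(targets[0], u), [u]) for u in candidates[0]]
    for target, cands in zip(targets[1:], candidates[1:]):
        new_best = []
        for unit in cands:
            cost, path = min(
                ((c + join_cost(p[-1], unit), p) for c, p in best),
                key=lambda pair: pair[0],
            )
            new_best.append((cost + target_cost(target, unit), path + [unit]))
        best = new_best
    return min(best, key=lambda pair: pair[0])
```

In a hybrid approach of the kind the abstract advocates, the selected units would then be passed to a speech modification step; that step is outside the scope of this sketch.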

van Santen, J., Mishra, T., & Klabbers, E. (2008). Prosodic Processing. In Springer Handbooks (pp. 471–488). Springer. https://doi.org/10.1007/978-3-540-49127-9_23