Abstract
We present an approach to pronunciation modeling in which the evolution of multiple linguistic feature streams is explicitly represented. This differs from phone-based models in that pronunciation variation is viewed as the result of feature asynchrony and changes in feature values, rather than phone substitutions, insertions, and deletions. We have implemented a flexible feature-based pronunciation model using dynamic Bayesian networks. In this paper, we describe our approach and report on a pilot experiment using phonetic transcriptions of utterances from the Switchboard corpus. The experimental results, as well as the model’s qualitative behavior, suggest that this is a promising way of accounting for the types of pronunciation variation often seen in spontaneous speech.
Cite
Livescu, K., & Glass, J. (2004). Feature-based pronunciation modeling for speech recognition. In HLT-NAACL 2004 - Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Short Papers (pp. 81–84). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1613984.1614005