Statistical multi-stream modeling of real-time MRI articulatory speech data

  • Erik Bresch
  • Nassos Katsamanis
  • Louis Goldstein
  • Shrikanth Narayanan

Abstract

This paper investigates different statistical modeling frameworks for articulatory speech data obtained using real-time (RT) magnetic resonance imaging (MRI). To quantitatively capture the spatio-temporal shaping process of the human vocal tract during speech production, a multi-dimensional stream of direct image features is extracted automatically from the MRI recordings. The features are closely related, though not identical, to the tract variables commonly defined in the theory of articulatory phonology. The modeling of the shaping process aims at decomposing the articulatory data streams into primitives by segmentation. A variety of approaches are investigated for carrying out the segmentation task, including vector quantizers, Gaussian mixture models, hidden Markov models, and a coupled hidden Markov model. We evaluate the performance of the different segmentation schemes qualitatively with the help of a well-understood data set that was used in an earlier study of inter-articulatory timing phenomena in American English nasal sounds. © 2010 ISCA.
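
As a rough, hypothetical illustration of the frame-wise segmentation idea the paper compares (the authors' actual features and models are not reproduced here), the Python sketch below clusters a synthetic multi-dimensional feature stream with a Gaussian mixture model and merges runs of identical component labels into candidate primitives; the feature dimensions, primitive count, and function names are all assumptions.

    # Hypothetical sketch: GMM-based frame-wise segmentation of a
    # multi-dimensional articulatory feature stream (not the authors' code).
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def segment_stream(features, n_primitives=8, seed=0):
        """Label each frame with a mixture component and merge consecutive
        identical labels into (start_frame, end_frame, label) segments."""
        gmm = GaussianMixture(n_components=n_primitives,
                              covariance_type="diag",
                              random_state=seed).fit(features)
        labels = gmm.predict(features)  # one component label per frame

        segments, start = [], 0
        for t in range(1, len(labels) + 1):
            if t == len(labels) or labels[t] != labels[start]:
                segments.append((start, t, int(labels[start])))
                start = t
        return segments

    if __name__ == "__main__":
        # Synthetic stand-in for an MRI-derived feature stream, e.g. a few
        # constriction-degree-like measures per video frame (assumed shape).
        rng = np.random.default_rng(0)
        stream = np.concatenate([rng.normal(loc=m, scale=0.3, size=(50, 4))
                                 for m in (0.0, 2.0, -1.5)])
        print(segment_stream(stream, n_primitives=3))

Unlike this frame-independent mixture, the HMM and coupled-HMM variants mentioned in the abstract additionally model temporal dynamics and cross-stream dependencies, which is what makes them attractive for studying inter-articulator timing.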

Author-supplied keywords

  • Articulatory modeling
  • Realtime magnetic resonance imaging
  • Speech production
