Accelerated Pseudo 3D Dynamic Speech MR Imaging at 3T Using Unsupervised Deep Variational Manifold Learning


Abstract

Magnetic resonance imaging (MRI) of vocal tract shaping and the surrounding articulators during speech is a powerful tool in several application areas, such as understanding language disorders and informing treatment plans in oropharyngeal cancers. However, it is a challenging task due to fundamental tradeoffs between spatio-temporal resolution, organ coverage, and signal-to-noise ratio. Current volumetric vocal tract MR methods are either restricted to imaging during sustained sounds or perform dynamic imaging at highly compromised spatio-temporal resolutions, suitable only for slowly moving articulators. In this work, we propose a novel unsupervised deep variational manifold learning approach to recover a “pseudo-3D” dynamic speech dataset from sequential acquisition of multiple 2D slices during speaking. We demonstrate “pseudo-3D” (or time-aligned multi-slice 2D) dynamic imaging at a high temporal resolution of 18 ms, capable of resolving vocal tract motion for arbitrary speech tasks. The approach jointly learns low-dimensional latent vectors corresponding to the image time frames and the parameters of a 3D convolutional neural network based generator that generates volumes of the deforming vocal tract. It does so by minimizing a cost function that enforces: a) temporal smoothness on the latent vectors; b) ℓ1-norm-based regularization on the generator weights; c) a zero-mean, unit-variance Gaussian distribution on the latent vectors of all the slices; and d) data consistency with the measured k-space versus time data. We evaluate the proposed method using in-vivo vocal tract airway datasets from two normal volunteers producing repeated speech tasks, and compare it against state-of-the-art 2D and 3D dynamic compressed sensing (CS) schemes in speech MRI. Finally, we demonstrate, for the first time, extraction of quantitative 3D vocal tract area functions from under-sampled 2D multi-slice datasets to characterize vocal tract shape changes in 3D during speech production.
Code: https://github.com/rushdi-rusho/varMRI.
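
The four terms of the cost function listed in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (the repository linked above is the authoritative source). All function names, the weights `lam_t`, `lam_w`, and `lam_kl`, the closed-form KL divergence used for term (c), and the single-coil Cartesian FFT used for term (d) are assumptions made purely for illustration; the generator network itself is omitted and its output is passed in as `images`.

```python
import numpy as np


def temporal_smoothness(z):
    """Term (a): squared differences between latent vectors of adjacent frames.

    z has shape (n_frames, latent_dim).
    """
    return float(np.sum(np.diff(z, axis=0) ** 2))


def l1_penalty(weights):
    """Term (b): l1 norm summed over all generator weight arrays."""
    return float(sum(np.sum(np.abs(w)) for w in weights))


def kl_to_standard_normal(mu, log_var):
    """Term (c), assumed form: KL divergence from N(mu, diag(exp(log_var)))
    to the zero-mean, unit-variance Gaussian N(0, I)."""
    return float(0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))


def data_consistency(images, kspace, mask):
    """Term (d): squared k-space error at the sampled locations only.

    images, kspace: complex arrays of shape (n_frames, ny, nx).
    mask: boolean array of the same shape, True where k-space was measured.
    Assumption: single-coil Cartesian sampling modelled by a 2D FFT.
    """
    predicted = np.fft.fft2(images, axes=(-2, -1), norm="ortho")
    residual = np.where(mask, predicted - kspace, 0.0)
    return float(np.sum(np.abs(residual) ** 2))


def total_cost(images, kspace, mask, mu, log_var, z, weights,
               lam_t=1.0, lam_w=1.0, lam_kl=1.0):
    """Weighted sum of the four terms; the lam_* weights are hypothetical."""
    return (data_consistency(images, kspace, mask)
            + lam_t * temporal_smoothness(z)
            + lam_w * l1_penalty(weights)
            + lam_kl * kl_to_standard_normal(mu, log_var))
```

In a full method, `images` would be produced by the generator from latent vectors sampled around `mu`, and the cost would be minimized jointly over the latent parameters and the generator weights with an automatic-differentiation framework; the sketch only evaluates the cost for given inputs.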

Cite

Rusho, R. Z., Zou, Q., Alam, W., Erattakulangara, S., Jacob, M., & Lingala, S. G. (2022). Accelerated Pseudo 3D Dynamic Speech MR Imaging at 3T Using Unsupervised Deep Variational Manifold Learning. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13436 LNCS, pp. 697–706). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-16446-0_66
