Deep learning approach to automated segmentation of tongue in camera images for computer-aided speech diagnosis

Abstract

This paper describes an approach to automated segmentation of the tongue in camera images for computer-aided speech diagnosis and therapy. Speech disorders are often related to non-normative positioning of the articulators. One common pathology in Polish pronunciation is interdentality, in which the tongue protrudes between the front teeth. Segmenting the tongue in camera images, and possibly parametrizing it, could support speech diagnosis. The presented system is based on images captured by two cameras directed at the speaker’s mouth at different angles, one on the left side and one on the right. A convolutional neural network was designed and trained for semantic segmentation of the tongue. Three input datasets were examined: one from each camera separately and one combining images from both cameras. On the combined dataset, the mean Jaccard index reached 74.01%, with a corresponding accuracy of 96.09%.
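
The two reported metrics can be illustrated with a minimal sketch. It assumes binary masks (tongue versus background) and a per-image Jaccard index; the abstract does not say how the mean is taken, and the function names `jaccard_index` and `pixel_accuracy` and the toy masks are hypothetical, not taken from the paper. The example shows one plausible reason accuracy can sit well above the Jaccard index: when the tongue covers few pixels, correctly labeled background dominates the accuracy score.

```python
import numpy as np


def jaccard_index(pred, target):
    """Intersection over union of two binary masks (hypothetical helper)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is an assumption.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the reference label."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    return float((pred == target).mean())


# Toy 10x10 example: the reference "tongue" region covers 8 of 100 pixels.
target = np.zeros((10, 10), dtype=bool)
target[4:6, 3:7] = True

# The prediction has the same shape but is shifted right by one column.
pred = np.zeros((10, 10), dtype=bool)
pred[4:6, 4:8] = True

print(f"Jaccard index:  {jaccard_index(pred, target):.2f}")   # 6 shared / 10 in union
print(f"Pixel accuracy: {pixel_accuracy(pred, target):.2f}")  # 96 of 100 pixels agree
```

In this toy case a one-column shift leaves 96 of 100 pixels correctly labeled, yet only 6 of the 10 pixels in the union overlap, so the Jaccard index is the stricter measure of how well the tongue region itself is recovered.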

Cite

Sage, A., Miodońska, Z., Kręcichwost, M., Trzaskalik, J., Kwaśniok, E., & Badura, P. (2021). Deep learning approach to automated segmentation of tongue in camera images for computer-aided speech diagnosis. In Advances in Intelligent Systems and Computing (Vol. 1186, pp. 41–51). Springer. https://doi.org/10.1007/978-3-030-49666-1_4
