Automatic speech recognizers currently perform poorly in the presence of noise. Humans, on the other hand, often compensate for noise degradation by extracting speech information from alternative sources and then integrating this information with the acoustic signal. Visual signals from the speaker’s face are one source of supplemental speech information. We demonstrate that multiple sources of speech information can be integrated at a sub-symbolic level to improve vowel recognition. Feedforward and recurrent neural networks are trained to estimate the acoustic characteristics of the vocal tract from images of the speaker’s mouth. These estimates are then combined with the noise-degraded acoustic information, effectively increasing the signal-to-noise ratio and improving the recognition of these noise-degraded signals. Alternative symbolic strategies, such as direct categorization of the visual signals into vowels, are also presented. The performance of these neural networks compared favorably with human performance and with other pattern-matching and estimation techniques. © 1990, IEEE
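The sub-symbolic integration described above can be sketched in outline: a small feedforward network maps mouth-image features to an estimate of the acoustic spectral envelope, and that estimate is fused with the noisy acoustic spectrum by a weighted average. This is a minimal illustration, not the paper's implementation; the layer sizes, the untrained random weights, the synthetic data, and the simple SNR-dependent weighting rule are all assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 25-value mouth-image feature vector,
# 32-bin spectral envelope, 8 hidden units.
N_PIXELS, N_BINS, N_HIDDEN = 25, 32, 8

def init_net(n_in, n_hidden, n_out, rng):
    """One-hidden-layer feedforward network (random weights; training omitted)."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "W2": rng.normal(scale=0.1, size=(n_out, n_hidden)),
    }

def forward(net, x):
    """Sigmoid hidden layer, linear output: a visually derived
    estimate of the spectral envelope."""
    h = 1.0 / (1.0 + np.exp(-net["W1"] @ x))
    return net["W2"] @ h

def fuse(acoustic_spectrum, visual_estimate, snr_weight):
    """Weighted average of the noisy acoustic spectrum and the visual
    estimate; a larger snr_weight trusts the acoustics more (high SNR)."""
    return snr_weight * acoustic_spectrum + (1.0 - snr_weight) * visual_estimate

# Synthetic example: a clean vowel spectrum, a noise-degraded version,
# and a random stand-in for the mouth-image features.
clean = np.abs(np.sin(np.linspace(0.0, np.pi, N_BINS)))
noisy = clean + rng.normal(scale=0.5, size=N_BINS)
mouth = rng.random(N_PIXELS)

net = init_net(N_PIXELS, N_HIDDEN, N_BINS, rng)
visual = forward(net, mouth)
fused = fuse(noisy, visual, snr_weight=0.5)
```

With a trained network, `fused` would sit closer to the clean spectrum than `noisy` does, which is the sense in which the visual channel "effectively increases the signal-to-noise ratio" before vowel classification.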
Yuhas, B. P., Goldstein, M. H., Sejnowski, T. J., & Jenkins, R. E. (1990). Neural Network Models of Sensory Integration for Improved Vowel Recognition. Proceedings of the IEEE, 78(10), 1658–1668. https://doi.org/10.1109/5.58349