Visual speech processing and recognition

Abstract

Lip reading is the ability to understand what a person is communicating using only visual information. With the advent of the Internet and computers, it is now possible to remove human intervention from lip reading. Such automation is feasible because of two developments in the field of computer vision: the availability of large-scale datasets for training and the use of neural network models. The applications are numerous: from dictating messages to a device in a noisy environment to improving speech recognition in current technologies, visual speech recognition has proved pivotal. In this paper, the lip-reading models are deep neural network architectures that capture temporal information and are built for the task of speech recognition.

Cite

Mahadevaswamy, U. B., Shashank Rao, M., Vrushab, S., Anagha, C., & Sangameshwar, V. (2021). Visual speech processing and recognition. In Advances in Intelligent Systems and Computing (Vol. 1141, pp. 481–491). Springer. https://doi.org/10.1007/978-981-15-3383-9_44
