Visual speech processing and recognition

U. B. Mahadevaswamy; M. Shashank Rao; S. Vrushab; C. Anagha; V. Sangameshwar

Conference Proceedings

Advances in Intelligent Systems and Computing (2021) 1141 481-491

DOI: 10.1007/978-981-15-3383-9_44

Abstract

Lip reading is the ability to understand what a person is communicating using only video information. With the advent of the Internet and computers, it is now possible to automate lip reading without human intervention. This automation is feasible because of two developments in the field of computer vision: the availability of large-scale datasets for training and the use of neural network models. Visual speech recognition has proved pivotal in numerous applications, from dictating messages to a device in a noisy environment to improving speech recognition in current technologies. In this paper, the lip-reading models are based on deep neural network architectures, created for the task of speech recognition, that capture temporal data.

Citation (APA)

Mahadevaswamy, U. B., Shashank Rao, M., Vrushab, S., Anagha, C., & Sangameshwar, V. (2021). Visual speech processing and recognition. In Advances in Intelligent Systems and Computing (Vol. 1141, pp. 481–491). Springer. https://doi.org/10.1007/978-981-15-3383-9_44
