Sentences Prediction Based on Automatic Lip-Reading Detection with Deep Learning Convolutional Neural Networks Using Video-Based Features

Khalid Mahboob; Hafsa Nizami; Fayyaz Ali; Farrukh Alvi

Conference Proceedings

Communications in Computer and Information Science (2021) 1489 CCIS 42-53

DOI: 10.1007/978-981-16-7334-4_4

4 Citations

9 Readers

Abstract

Lip-reading is the process of deciphering text from the visual interpretation of a speaker’s facial, lip, and mouth movements without using audio. The challenge is traditionally divided into two stages: creating or learning visual characteristics, and prediction. End-to-end techniques for deep lip-reading have been popular in recent years. Existing work on end-to-end models, however, performs only word classification rather than sentence-level sequence prediction. Human lip-reading ability improves with longer words, which suggests the relevance of characteristics that capture temporal context in an inconsistent communication channel. In this study, an end-to-end model based on deep learning convolutional neural networks has been employed to develop an automated lip-reading system that uses a recurrent network, spatiotemporal convolutions, and the connectionist temporal classification loss to translate a variable-length sequence of video frames to text. The accuracy of the trained lip-reading process in predicting sentences was evaluated using video-based features.
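The abstract names the connectionist temporal classification (CTC) loss as the mechanism that maps a variable-length run of video frames to text. As a rough illustration of the idea only, and not the authors' implementation, the sketch below shows greedy CTC decoding: take the best-scoring symbol for each frame, merge consecutive repeats, then drop the blank symbol. The function name `ctc_greedy_decode`, the `_` blank marker, the three-symbol alphabet, and the per-frame scores are all hypothetical and chosen for this example.

```python
from typing import List, Sequence

BLANK = "_"  # hypothetical blank marker; real systems reserve a class index for it


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], alphabet: str) -> str:
    """Greedy CTC decoding: best symbol per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring symbol for every frame.
    best = [alphabet[max(range(len(row)), key=row.__getitem__)] for row in frame_scores]

    decoded: List[str] = []
    previous = None
    for symbol in best:
        # Emit a symbol only when it differs from the previous frame and is not blank.
        if symbol != previous and symbol != BLANK:
            decoded.append(symbol)
        previous = symbol
    return "".join(decoded)


# Six made-up frames over the alphabet "_ab": per-frame best symbols are a a _ a b b.
ALPHABET = "_ab"
frames = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.1, 0.8],
]
print(ctc_greedy_decode(frames, ALPHABET))  # prints "aab"
```

The blank between the second and fourth frames is what lets the decoder keep two separate `a` characters, which is how CTC can represent repeated letters even though consecutive duplicates are merged. Six input frames yield three output characters here, showing how the output length is decoupled from the number of frames.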

Citation (APA)

Mahboob, K., Nizami, H., Ali, F., & Alvi, F. (2021). Sentences Prediction Based on Automatic Lip-Reading Detection with Deep Learning Convolutional Neural Networks Using Video-Based Features. In Communications in Computer and Information Science (Vol. 1489 CCIS, pp. 42–53). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-16-7334-4_4
