In this paper, we present LipSpeaker, a system designed to help people with acquired voice disorders communicate in daily life. Users simply face their smartphone's camera and move their lips to imitate the pronunciation of words. LipSpeaker recognizes the lip movements, converts them to text, and then generates audio for playback. Compared with text, a mel-spectrogram carries more emotional information, so to generate smoother and more expressive audio we also predict a mel-spectrogram directly, rather than text, by recognizing the user's lip movements and facial expressions together.
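The abstract contrasts predicting text with predicting a mel-spectrogram. As background on what that intermediate representation is (the paper gives no implementation details, so this is an illustrative numpy sketch, not the authors' code, and all parameter values such as an 80-band filterbank at 16 kHz are assumptions), a mel-spectrogram can be computed from audio like this:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the perceptual mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def mel_spectrogram(signal, sr=16000, n_fft=512, hop=128, n_mels=80):
    # Frame the signal, window each frame, take the power spectrum,
    # then project onto the mel filterbank.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2   # (frames, bins)
    return mel_filterbank(sr, n_fft, n_mels) @ power.T  # (mels, frames)

# Example: one second of a 440 Hz tone.
sr = 16000
t = np.arange(sr) / sr
mel = mel_spectrogram(np.sin(2 * np.pi * 440 * t), sr=sr)
print(mel.shape)  # (80, 122)
```

A lip-to-speech model such as the one described here would predict a matrix of this shape frame by frame from the video, after which a vocoder turns it back into a waveform.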
Chen, Y., Zhang, J., Zhang, Y., & Ochiai, Y. (2019). LipSpeaker: Helping Acquired Voice Disorders People Speak Again. In Communications in Computer and Information Science (Vol. 1088, pp. 143–148). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-30712-7_19