Visual language identification from facial landmarks



Abstract

Automatic Visual Language IDentification (VLID), i.e. the problem of identifying the language being spoken from visual information alone, using no audio, is studied. The proposed method employs facial landmarks automatically detected in a video. A convex optimisation problem is formulated to jointly find both a discriminative representation (a soft-histogram over a set of lip shapes) and the classifier. A 10-fold cross-validation on a dataset of 644 videos collected from youtube.com yields an accuracy of 73% in pairwise discrimination between English and French (50% chance level). A study in which 10 videos were used suggests that the proposed method discriminates between the languages better than the average human.
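The soft-histogram descriptor mentioned above can be sketched roughly as follows: each frame's lip-shape feature is softly assigned to a codebook of lip-shape prototypes, and the assignments are averaged over the video. This is a minimal illustrative sketch, not the paper's implementation; the feature format, the codebook, and the softness parameter `beta` are all assumptions.

```python
import numpy as np

def soft_histogram(frames, prototypes, beta=1.0):
    """Soft-histogram of lip shapes over a video (illustrative sketch).

    frames:     (T, D) array, one lip-landmark feature vector per frame
                (hypothetical feature format).
    prototypes: (K, D) codebook of lip-shape prototypes (assumed given).
    beta:       softness of the assignment (larger -> closer to hard
                assignment); a hypothetical parameter.
    Returns a (K,) non-negative descriptor summing to 1.
    """
    # Squared Euclidean distance of every frame to every prototype, (T, K).
    d2 = ((frames[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    # Softmax over prototypes: each frame contributes a soft assignment.
    w = np.exp(-beta * (d2 - d2.min(axis=1, keepdims=True)))  # stabilised
    w /= w.sum(axis=1, keepdims=True)
    # Average over frames: a soft-histogram describing the whole video.
    return w.mean(axis=0)
```

Such a descriptor could then be fed to a linear classifier; in the paper, the representation and the classifier are learned jointly via a convex formulation, which this sketch does not reproduce.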

Citation (APA)

Špetlík, R., Čech, J., Franc, V., & Matas, J. (2017). Visual language identification from facial landmarks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10270 LNCS, pp. 389–400). Springer Verlag. https://doi.org/10.1007/978-3-319-59129-2_33
