Meta Triplet Learning for Multiview Sign Language Recognition


Abstract

Multiview video recognition is a hard problem when the subject is in continuous motion, and it becomes even tougher when the subject is a human being and the actions to be recognized from the video data are sign language gestures. Although many deep learning models have been successfully applied to sign language recognition (SLR), very few consider multiple views in their training sets. In this work, we propose to apply meta metric learning to video-based sign language recognition. In contrast to traditional metric learning, where the triplet loss is constructed on sample-based distances, the meta metric is learned on set-based distances. Accordingly, we construct meta cells over the entire multiview dataset and perform task-based learning with respect to support cells and query sets. Additionally, we propose a maximum view pooled distance on sub-tasks for binding intra-class views. Experiments conducted on the multiview sign language dataset and four action datasets show that the proposed multiview meta metric learning model (MVDMML) achieves 11% higher performance than the baselines.
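The abstract names two ingredients: a triplet loss computed on set-based rather than sample-based distances, and a maximum view pooled distance that binds intra-class views across cameras. The Python sketch below illustrates how such a loss could be assembled; the cell construction, the mean pairwise set distance, the max-over-views pooling rule, and the margin value are assumptions for illustration, not details taken from the paper.

import torch
import torch.nn.functional as F

def set_distance(query_feats, cell_feats):
    # Mean pairwise Euclidean distance between a query set (Q, D)
    # and a support cell (S, D) holding embeddings of one class.
    d = torch.cdist(query_feats, cell_feats)   # (Q, S) pairwise distances
    return d.mean()

def max_view_pooled_distance(query_feats, view_cells):
    # Pool the set distance over per-view cells of the same class by taking
    # the maximum, so the hardest (most distant) view dominates the loss.
    dists = torch.stack([set_distance(query_feats, v) for v in view_cells])
    return dists.max()

def meta_triplet_loss(query_feats, pos_view_cells, neg_view_cells, margin=1.0):
    # Set-based triplet loss: pull the query set toward the positive-class
    # cells (over all views) and push it away from the negative-class cells.
    d_pos = max_view_pooled_distance(query_feats, pos_view_cells)
    d_neg = max_view_pooled_distance(query_feats, neg_view_cells)
    return F.relu(d_pos - d_neg + margin)

# Illustrative usage with random 128-d embeddings and 3 camera views.
q = torch.randn(4, 128)                          # query set
pos = [torch.randn(5, 128) for _ in range(3)]    # positive support cells
neg = [torch.randn(5, 128) for _ in range(3)]    # negative support cells
loss = meta_triplet_loss(q, pos, neg)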

Citation (APA)

Mopidevi, S., Prasad, M. V. D., Polurie, V. V. K., & Dande, A. K. (2023). Meta Triplet Learning for Multiview Sign Language Recognition. International Journal of Intelligent Engineering and Systems, 16(2), 375–388. https://doi.org/10.22266/ijies2023.0430.30
