Continuous Sign Language Recognition Based on Spatial-Temporal Graph Attention Network

Abstract

Continuous sign language recognition (CSLR) is challenging due to complex video backgrounds, hand-gesture variability, and the difficulty of temporal modeling. This work proposes a CSLR method based on a spatial-temporal graph attention network that focuses on the essential features of video sequences. The method captures local details of sign language movements by taking joint and bone information as input and constructing a spatial-temporal graph that reflects both inter-frame relevance and the physical connections between nodes. A graph-based multi-head attention mechanism, combined with adjacency-matrix computation, explores local features, and a temporal convolutional network models short-term motion correlations. A bidirectional LSTM (BLSTM) learns long-term dependencies, and connectionist temporal classification (CTC) aligns the word-level output sequences. The proposed method achieves competitive results, with a word error rate of 1.59% on the Chinese Sign Language dataset and a mean Jaccard Index of 65.78% on the ChaLearn LAP Continuous Gesture Dataset.
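The abstract outlines a four-stage pipeline: graph attention over skeleton joints masked by an adjacency matrix, a temporal convolution for short-term motion, a BLSTM for long-term dependence, and a CTC output head. The following is a minimal PyTorch sketch of that pipeline under stated assumptions; the layer sizes, joint count, module names (`GraphMultiHeadAttention`, `STGraphAttentionCSLR`), and the use of a self-loop identity graph are illustrative choices, not the authors' implementation.

```python
# Minimal sketch of a spatial-temporal graph attention CSLR pipeline.
# All hyperparameters and the skeleton graph are illustrative assumptions.
import torch
import torch.nn as nn


class GraphMultiHeadAttention(nn.Module):
    """Multi-head self-attention over skeleton joints, masked by the
    adjacency matrix so attention follows physical bone connections."""

    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x, adj):
        # x: (B*T, V, C) joint features; adj: (V, V), 1 = connected.
        # adj is assumed to include self-loops, so no row is fully masked.
        mask = adj == 0  # True entries are blocked from attention
        out, _ = self.attn(x, x, x, attn_mask=mask)
        return out


class STGraphAttentionCSLR(nn.Module):
    def __init__(self, in_ch=3, dim=64, num_joints=25, vocab=1000):
        super().__init__()
        self.embed = nn.Linear(in_ch, dim)
        self.gat = GraphMultiHeadAttention(dim)
        # Temporal convolution: short-term motion correlation across frames
        self.tcn = nn.Conv1d(dim, dim, kernel_size=5, padding=2)
        # BLSTM: long-term dependence over the frame sequence
        self.blstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * dim, vocab + 1)  # +1 for the CTC blank

    def forward(self, x, adj):
        # x: (B, T, V, C) joint coordinates (or bone vectors)
        B, T, V, C = x.shape
        h = self.embed(x).reshape(B * T, V, -1)
        h = self.gat(h, adj).reshape(B, T, V, -1)
        h = h.mean(dim=2)                             # pool joints: (B, T, dim)
        h = self.tcn(h.transpose(1, 2)).transpose(1, 2)
        h, _ = self.blstm(h)
        return self.head(h).log_softmax(-1)           # (B, T, vocab+1)


# Hypothetical usage: 2 clips, 30 frames, 25 joints, 3D coordinates.
adj = torch.eye(25)  # placeholder graph; a real skeleton adds bone edges
model = STGraphAttentionCSLR()
log_probs = model(torch.randn(2, 30, 25, 3), adj)
```

For training, the `(B, T, vocab+1)` log-probabilities would be transposed to `(T, B, C)` and passed to `nn.CTCLoss` together with the gloss label sequences, which is how CTC aligns frame-level predictions to word-level targets without frame annotations.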

Citation (APA)

Guo, Q., Zhang, S., & Li, H. (2023). Continuous Sign Language Recognition Based on Spatial-Temporal Graph Attention Network. CMES - Computer Modeling in Engineering and Sciences, 134(3), 1653–1670. https://doi.org/10.32604/cmes.2022.021784
