Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods, and their spatial-temporal features are not discriminative enough. Temporal convolution with a single fixed kernel cannot extract sufficiently discriminative temporal features for different actions. In addition, only a single-scale feature is used for classification, which ignores multilevel information. In this article, we propose a novel multi-scale and multi-stream improved graph convolutional network (MM-IGCN). In each spatial-temporal block of MM-IGCN, we employ an improved temporal convolution with multiple parallel kernels to enhance the temporal features. An improved GCN and an enhanced attention module are also adopted in the block to strengthen the spatial-temporal features. A multi-scale structure is introduced to action recognition for the first time to capture multilevel information. The improved spatial-temporal blocks and the multi-scale structure together form our single-stream model. Moreover, we adopt the bone cosine distance as a novel input feature. Five feature streams (joint, bone, their respective motions, and bone cosine distance) are fed separately into our single-stream model, and together these streams compose MM-IGCN. Experiments on two large datasets, NTU-RGB+D and NTU-RGB+D-120, show that our single-stream model achieves state-of-the-art performance and that MM-IGCN substantially outperforms other models.
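To make the idea of a temporal convolution with multiple parallel kernels concrete, the following is a minimal NumPy sketch, not the authors' implementation. The kernel sizes (3, 5, 7), the fixed averaging weights, the concatenation of branches along the channel axis, and the input shape (16 frames, 25 joints, 3 coordinates) are all illustrative assumptions; in a trained network the kernel weights would be learned, and the abstract does not say how the branches are fused.

```python
import numpy as np


def temporal_conv(x, kernel):
    """Slide a 1-D kernel along the time axis with 'same' zero padding.

    x      : array of shape (T, V, C) = (frames, joints, channels).
    kernel : 1-D array of odd length, shared by all joints and channels
             (an assumption made to keep the sketch small).
    As in most deep learning code, this is a cross-correlation.
    """
    k = len(kernel)
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    out = np.zeros_like(x, dtype=float)
    for i, w in enumerate(kernel):
        # Each kernel tap contributes a time-shifted copy of the input.
        out += w * padded[i:i + x.shape[0]]
    return out


def multi_kernel_temporal_conv(x, kernels):
    """Apply several temporal kernels in parallel and concatenate the
    branch outputs along the channel axis (hypothetical fusion choice)."""
    branches = [temporal_conv(x, k) for k in kernels]
    return np.concatenate(branches, axis=-1)


# Hypothetical averaging kernels covering short, medium, and longer spans.
kernels = [np.ones(k) / k for k in (3, 5, 7)]

# Synthetic skeleton clip: 16 frames, 25 joints, 3 coordinates per joint.
x = np.random.default_rng(0).normal(size=(16, 25, 3))
y = multi_kernel_temporal_conv(x, kernels)
print(y.shape)
```

Each branch sees a different temporal span, so short and long motion patterns both reach the following layers; a single fixed kernel would commit to one span for every action.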
Li, W., Liu, X., Liu, Z., Du, F., & Zou, Q. (2020). Skeleton-Based Action Recognition Using Multi-Scale and Multi-Stream Improved Graph Convolutional Network. IEEE Access, 8, 144529–144542. https://doi.org/10.1109/ACCESS.2020.3014445