Deep neural networks have outperformed many traditional methods for action recognition on video datasets such as UCF101 and HMDB51. This paper explores the performance gains obtainable by fusing convolutional networks of different dimensionalities. Its main contribution is the Multi-Modality Fusion Network (MMFN), a novel framework for action recognition that combines 2D ConvNets and 3D ConvNets. MMFN outperforms state-of-the-art deep-learning-based methods, reaching 94.6% accuracy on UCF101 and 69.7% on HMDB51.
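The abstract does not specify how the 2D and 3D streams are combined; a common approach for such two-stream models is score-level fusion, sketched below. The function names and the weighted-averaging scheme are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch of score-level fusion for a two-stream model.
# The paper fuses 2D and 3D ConvNets; weighted averaging of per-class
# scores is an assumed fusion scheme, shown for illustration only.

def fuse_scores(scores_2d, scores_3d, weight_2d=0.5):
    """Weighted average of per-class scores from a 2D and a 3D stream."""
    weight_3d = 1.0 - weight_2d
    return [weight_2d * s2 + weight_3d * s3
            for s2, s3 in zip(scores_2d, scores_3d)]

def predict(scores_2d, scores_3d, weight_2d=0.5):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(scores_2d, scores_3d, weight_2d)
    return max(range(len(fused)), key=fused.__getitem__)
```

With equal weights, a class favored strongly by one stream can still win overall, which is the usual motivation for combining complementary modalities.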
Huang, K., Qin, Z., Xu, K., Ye, S., & Wang, G. (2018). Multi-modality fusion network for action recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10736 LNCS, pp. 139–149). Springer Verlag. https://doi.org/10.1007/978-3-319-77383-4_14