Action recognition is an important yet challenging task in computer vision. In this paper, we propose a novel deep learning framework for action recognition that improves recognition accuracy by 1) deriving more precise features for representing actions and 2) reducing the asynchrony between different information streams. We first introduce a coarse-to-fine network, which extracts shared deep features at different action class granularities and progressively integrates them to obtain a more accurate feature representation of the input actions. We further introduce an asynchronous fusion network, which fuses information from different streams by asynchronously integrating stream-wise features at different time points, thereby better leveraging the complementary information across streams. Experimental results on action recognition benchmarks demonstrate that our approach achieves state-of-the-art performance.
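To make the idea of integrating stream-wise features at different time points concrete, the following is a minimal sketch and not the network described in the paper. It assumes two streams (named here `appearance` and `motion`, both hypothetical) that have already been reduced to per-time-step feature vectors, and it replaces the learned fusion with a fixed average over a small set of hypothetical time offsets.

```python
import numpy as np


def async_fuse(appearance, motion, offsets=(-1, 0, 1)):
    """Pair each appearance feature with motion features at shifted time steps.

    appearance: array of shape (T, Da), one feature vector per time step.
    motion:     array of shape (T, Dm), one feature vector per time step.
    offsets:    hypothetical time shifts; the paper's network learns its
                fusion, whereas this sketch simply averages over the shifts.
    Returns an array of shape (T, Da + Dm).
    """
    appearance = np.asarray(appearance, dtype=float)
    motion = np.asarray(motion, dtype=float)
    if appearance.shape[0] != motion.shape[0]:
        raise ValueError("both streams must cover the same number of time steps")

    num_steps = appearance.shape[0]
    t = np.arange(num_steps)
    # Row k holds the indices t + offsets[k], clamped to the valid range.
    idx = np.clip(t[None, :] + np.asarray(offsets)[:, None], 0, num_steps - 1)
    # motion[idx] has shape (len(offsets), T, Dm); average over the offsets.
    motion_context = motion[idx].mean(axis=0)
    # Concatenate the time-t appearance feature with the shifted motion context.
    return np.concatenate([appearance, motion_context], axis=1)


# Toy input: 4 time steps, 2-dim appearance features, 3-dim motion features.
appearance = np.zeros((4, 2))
motion = np.arange(4, dtype=float)[:, None] * np.ones((4, 3))
fused = async_fuse(appearance, motion)
```

With `offsets=(0,)` the function reduces to ordinary synchronous fusion, where each time step sees only the other stream's feature from the same instant. Adding nonzero offsets lets a time step draw on the other stream's features from neighbouring instants.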
Lin, W., Mi, Y., Wu, J., Lu, K., & Xiong, H. (2018). Action recognition with coarse-to-fine deep feature integration and asynchronous fusion. In 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 (pp. 7130–7137). AAAI Press. https://doi.org/10.1609/aaai.v32i1.12232