Weakly-supervised action recognition and localization via knowledge transfer


Abstract

Action recognition and localization have attracted much attention over the past decade. A key challenge, however, is that training models on untrimmed videos typically requires large-scale temporal annotations of action instances, which is impractical in many real-world applications. To alleviate this problem, we propose KTUntrimmedNet, a novel weakly-supervised action recognition framework for untrimmed videos that uses only video-level annotations and transfers information from publicly available trimmed videos to assist model learning. A two-stage method guarantees an effective transfer strategy: first, the trimmed and untrimmed videos are clustered to find similar classes between them, avoiding negative transfer from the trimmed data; second, an invariant module finds features common to the trimmed and untrimmed videos to further improve performance. Extensive experiments on the standard benchmark datasets THUMOS14 and ActivityNet1.3 clearly demonstrate the efficacy of the proposed method compared with existing state-of-the-art approaches.
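The abstract only outlines the two stages, so the following is a minimal sketch of the idea in Python: prototype-based class matching by cosine similarity stands in for the stage-one clustering, and a simple mean-feature gap penalty stands in for the stage-two invariant module. The function names, the threshold, and both modeling choices are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the two-stage transfer idea described in the abstract.
# All names and the specific similarity/penalty choices are illustrative
# assumptions, not the authors' method.
import numpy as np

def class_prototypes(features, labels, num_classes):
    """Mean feature vector (prototype) for each class."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def match_similar_classes(trimmed_protos, untrimmed_protos, threshold=0.7):
    """Stage 1: pair each untrimmed class with its most similar trimmed
    class by cosine similarity; pairs below the threshold are dropped,
    which is one simple way to avoid negative transfer."""
    t = trimmed_protos / np.linalg.norm(trimmed_protos, axis=1, keepdims=True)
    u = untrimmed_protos / np.linalg.norm(untrimmed_protos, axis=1, keepdims=True)
    sim = u @ t.T                         # (num_untrimmed, num_trimmed)
    pairs = []
    for u_cls, row in enumerate(sim):
        t_cls = int(row.argmax())
        if row[t_cls] >= threshold:
            pairs.append((u_cls, t_cls, float(row[t_cls])))
    return pairs

def invariance_penalty(trimmed_feats, untrimmed_feats):
    """Stage 2 (stand-in for the invariant module): penalize the squared
    distance between the mean features of the two domains so that shared
    features dominate; a learned module would replace this in practice."""
    gap = trimmed_feats.mean(axis=0) - untrimmed_feats.mean(axis=0)
    return float(gap @ gap)

# Toy usage with random features standing in for video clip descriptors.
rng = np.random.default_rng(0)
trimmed = rng.normal(size=(200, 64)); trimmed_y = rng.integers(0, 10, 200)
untrimmed = rng.normal(size=(150, 64)); untrimmed_y = rng.integers(0, 5, 150)
pairs = match_similar_classes(class_prototypes(trimmed, trimmed_y, 10),
                              class_prototypes(untrimmed, untrimmed_y, 5))
print(pairs[:3], invariance_penalty(trimmed, untrimmed))
```

In a setup like this, the threshold trades transfer coverage against the risk of negative transfer, and the fixed penalty would be replaced by a module trained jointly with the recognition loss.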

Citation (APA)

Shi, H., Zhang, X., & Li, C. (2019). Weakly-supervised action recognition and localization via knowledge transfer. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11857 LNCS, pp. 205–216). Springer. https://doi.org/10.1007/978-3-030-31654-9_18
