Relational prototypical network for weakly supervised temporal action localization


Abstract

In this paper, we propose a weakly supervised temporal action localization method for untrimmed videos based on prototypical networks. We observe two challenges posed by weak supervision, namely action-background separation and action relation construction. Unlike previous methods, we propose to achieve action-background separation using only the original videos. To this end, a clustering loss is adopted to separate actions from backgrounds and to learn intra-compact features, which helps detect complete action instances. In addition, a similarity weighting module is devised to further separate actions from backgrounds. To identify actions effectively, we propose to construct relations among actions for prototype learning, introducing a GCN-based prototype embedding module that generates relational prototypes. Experiments on the THUMOS14 and ActivityNet1.2 datasets show that our method outperforms state-of-the-art methods.
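The paper's full pipeline is described in the AAAI text; as a rough illustration of the relational-prototype idea only, the sketch below (all function names and the toy data are our own, not from the paper) computes per-class prototypes as class means, propagates them once over a cosine-similarity graph (a single GCN-style step with self-loops and an identity weight matrix), and assigns snippet features to the nearest resulting prototype.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # unit-normalize each row vector
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def class_prototypes(features, labels, num_classes):
    # standard prototypical-network prototype: mean feature per class
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def relational_prototypes(prototypes):
    # Build a similarity graph over the prototypes and apply one
    # GCN-style propagation step (row-normalized adjacency with
    # self-loops, identity weight matrix), so each prototype absorbs
    # information from related action classes.
    p = l2_normalize(prototypes)
    adj = np.maximum(p @ p.T, 0.0) + np.eye(len(prototypes))
    adj /= adj.sum(axis=1, keepdims=True)
    return adj @ prototypes

def nearest_prototype(snippets, prototypes):
    # assign each snippet feature to its closest prototype (Euclidean)
    d = ((snippets[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

# toy snippet features for three hypothetical action classes
feats = np.array([
    [1.0, 0.0, 0.0], [0.9, 0.1, 0.0],   # class 0
    [0.0, 1.0, 0.0], [0.1, 0.9, 0.0],   # class 1
    [0.0, 0.0, 1.0], [0.0, 0.1, 0.9],   # class 2
])
labels = np.array([0, 0, 1, 1, 2, 2])

protos = relational_prototypes(class_prototypes(feats, labels, 3))
pred = nearest_prototype(feats, protos)   # → [0, 0, 1, 1, 2, 2]
```

A single propagation step pulls each prototype slightly toward the classes it is most similar to while the self-loop keeps it anchored to its own class mean; the actual model learns the graph weights rather than using raw cosine similarity.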

Cite (APA)

Huang, L., Huang, Y., Ouyang, W., & Wang, L. (2020). Relational prototypical network for weakly supervised temporal action localization. In AAAI 2020 - 34th AAAI Conference on Artificial Intelligence (pp. 11053–11060). AAAI Press. https://doi.org/10.1609/aaai.v34i07.6760
