Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding

11 citations · 50 Mendeley readers

Abstract

Temporal language grounding (TLG) aims to localize a video segment in an untrimmed video based on a natural language description. To alleviate the expensive cost of manually annotating temporal boundary labels, we focus on the weakly supervised setting, where only video-level descriptions are provided for training. Most existing weakly supervised methods generate a set of candidate segments and learn cross-modal alignment through a multiple-instance-learning (MIL) framework. However, the temporal structure of the video and the complex semantics of the sentence are lost during learning. In this work, we propose a novel candidate-free framework, the Fine-grained Semantic Alignment Network (FSAN), for weakly supervised TLG. Instead of treating the sentence and candidate moments as wholes, FSAN learns token-by-clip cross-modal semantic alignment via an iterative cross-modal interaction module, generates a fine-grained cross-modal semantic alignment map, and performs grounding directly on top of this map. Extensive experiments on two widely used benchmarks, ActivityNet-Captions and DiDeMo, show that FSAN achieves state-of-the-art performance.
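For intuition, the sketch below shows one way a token-by-clip alignment map could be computed: project word-token and video-clip features into a shared space and score every (token, clip) pair. This is a minimal illustration, not the authors' implementation; the module name, feature dimensions, and scaled dot-product scoring are all assumptions for the example.

```python
# Illustrative sketch (not the paper's code): a token-by-clip
# cross-modal alignment map from word-token and video-clip features.
import torch
import torch.nn as nn


class TokenClipAlignment(nn.Module):
    """Projects both modalities into a shared space and scores every
    (token, clip) pair, yielding a fine-grained alignment map."""

    def __init__(self, text_dim=300, video_dim=1024, shared_dim=256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, shared_dim)    # word tokens -> shared space
        self.video_proj = nn.Linear(video_dim, shared_dim)  # video clips -> shared space
        self.scale = shared_dim ** 0.5

    def forward(self, tokens, clips):
        # tokens: (batch, num_tokens, text_dim); clips: (batch, num_clips, video_dim)
        t = self.text_proj(tokens)   # (batch, num_tokens, shared_dim)
        v = self.video_proj(clips)   # (batch, num_clips, shared_dim)
        # Alignment map: similarity of every token with every clip.
        align_map = torch.einsum("btd,bcd->btc", t, v) / self.scale
        return align_map             # (batch, num_tokens, num_clips)


if __name__ == "__main__":
    model = TokenClipAlignment()
    tokens = torch.randn(2, 12, 300)   # e.g. word embeddings for a 12-token query
    clips = torch.randn(2, 64, 1024)   # e.g. features for 64 video clips
    print(model(tokens, clips).shape)  # torch.Size([2, 12, 64])
```

In the paper's framework, grounding is then performed directly on such a fine-grained map rather than on scores for pre-defined candidate segments.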

Cite

APA

Wang, Y., Zhou, W., & Li, H. (2021). Fine-grained Semantic Alignment Network for Weakly Supervised Temporal Language Grounding. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 89–99). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-emnlp.9
