Asymmetric Relation Consistency Reasoning for Video Relation Grounding

Huan Li; Ping Wei; Jiapeng Li; Zeyu Ma; Jiahui Shang; Nanning Zheng

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13695 LNCS, pp. 125-141 (2022)

DOI: 10.1007/978-3-031-19833-5_8

Abstract

Video relation grounding has attracted growing attention in the fields of video understanding and multimodal learning. Although recent years have seen remarkable progress on this problem, multi-instance confusion and complex temporal reasoning keep it a challenging task. In this paper, we propose a novel Asymmetric Relation Consistency (ARC) reasoning model for video relation grounding. To overcome the multi-instance confusion problem, we propose an asymmetric relation reasoning method and a novel relation consistency loss that keep relationships consistent across multiple instances. To localize the relation instance precisely in its temporal context, we propose a transformer-based relation reasoning module. The model is trained in a weakly supervised manner. The proposed method was tested on a challenging video relation dataset. Experiments show that it outperforms state-of-the-art methods by a large margin. Extensive ablation studies further demonstrate the effectiveness of the proposed method.

Citation (APA)

Li, H., Wei, P., Li, J., Ma, Z., Shang, J., & Zheng, N. (2022). Asymmetric Relation Consistency Reasoning for Video Relation Grounding. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13695 LNCS, pp. 125–141). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19833-5_8
