Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

Abstract

Video scene graph generation (VidSGG) aims to parse video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, owing to the long-tailed distribution of training data in existing datasets, the generalization performance of VidSGG models can suffer from the spatio-temporal conditional bias problem. In this work, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework that addresses this bias problem from a meta-learning perspective. Specifically, to handle various types of spatio-temporal conditional bias, the framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set differs from that of the support set w.r.t. one type of conditional bias. Then, through a novel meta training-and-testing process that optimizes the model to obtain good testing performance on the query sets after training on the support set, the framework effectively guides the model to learn to generalize against these biases. Extensive experiments demonstrate the efficacy of the proposed framework.
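To make the described meta training-and-testing process concrete, below is a minimal sketch of one meta-optimization step in the style of MAML-like second-order meta-learning: the model is virtually trained on the support set, then evaluated on query sets whose distributions differ from the support set w.r.t. a type of conditional bias, with gradients flowing back through the inner update. All names here (the toy linear predicate classifier, meta_debias_step, inner_lr, the random stand-in data) are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def meta_debias_step(weight, bias, support, queries, inner_lr=0.01):
    """One meta step: inner update on the support set, then sum the
    losses on the bias-shifted query sets (second-order, MAML-style)."""
    x_s, y_s = support
    support_loss = F.cross_entropy(F.linear(x_s, weight, bias), y_s)
    # create_graph=True lets the query losses backpropagate through
    # the virtual SGD step below.
    g_w, g_b = torch.autograd.grad(support_loss, (weight, bias), create_graph=True)
    fast_w = weight - inner_lr * g_w  # "fast weights" after virtual training
    fast_b = bias - inner_lr * g_b
    # Evaluate the virtually trained model on every query set; a model that
    # merely fits the support-set bias incurs a high meta loss here.
    return sum(F.cross_entropy(F.linear(x_q, fast_w, fast_b), y_q)
               for x_q, y_q in queries)

# Toy usage with random tensors standing in for relation features/labels.
torch.manual_seed(0)
feat_dim, num_predicates = 128, 26
weight = torch.randn(num_predicates, feat_dim, requires_grad=True)
bias = torch.zeros(num_predicates, requires_grad=True)
opt = torch.optim.Adam([weight, bias], lr=1e-3)

support = (torch.randn(32, feat_dim), torch.randint(0, num_predicates, (32,)))
queries = [(torch.randn(16, feat_dim), torch.randint(0, num_predicates, (16,)))
           for _ in range(3)]  # one query set per type of conditional bias

opt.zero_grad()
meta_debias_step(weight, bias, support, queries).backward()
opt.step()
```

In the actual framework, the query sets are constructed from the training data so that each one shifts the conditional distribution in a controlled way; the random splits above only stand in for that construction.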

Citation (APA)

Xu, L., Qu, H., Kuen, J., Gu, J., & Liu, J. (2022). Meta Spatio-Temporal Debiasing for Video Scene Graph Generation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13687 LNCS, pp. 374–390). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-19812-0_22
