Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments

Abstract

We present a novel method for few-shot video classification that performs appearance and temporal alignments. In particular, given a pair of query and support videos, we conduct appearance alignment via frame-level feature matching to obtain an appearance similarity score between the videos, while leveraging temporal order-preserving priors to obtain a temporal similarity score between the videos. Moreover, we introduce a few-shot video classification framework that leverages the above appearance and temporal similarity scores across multiple steps, namely prototype-based training and testing as well as inductive and transductive prototype refinement. To the best of our knowledge, our work is the first to explore transductive few-shot video classification. Extensive experiments on both the Kinetics and Something-Something V2 datasets show that both appearance and temporal alignments are crucial for datasets with temporal order sensitivity such as Something-Something V2. Our approach achieves results similar to or better than those of previous methods on both datasets. Our code is available at https://github.com/VinAIResearch/fsvc-ata.
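
The abstract leaves the exact scoring formulation to the paper itself; the following is a minimal sketch of one plausible reading, assuming cosine similarity between L2-normalized frame features for the appearance alignment and a Gaussian band around the diagonal of the frame-to-frame similarity matrix as the temporal order-preserving prior. All function names and the sigma parameter are illustrative assumptions, not the authors' API.

```python
import numpy as np

def frame_similarity(query_feats, support_feats):
    """Cosine similarity between every pair of query / support frames.
    Inputs are (T, D) arrays of L2-normalized frame features."""
    return query_feats @ support_feats.T           # (T_q, T_s)

def appearance_score(sim):
    """Appearance alignment: match each query frame to its most similar
    support frame, then average the matched similarities."""
    return sim.max(axis=1).mean()

def temporal_score(sim, sigma=0.3):
    """Temporal alignment: re-weight the similarity matrix with an
    order-preserving prior concentrated around the (normalized) diagonal,
    so frame pairs that respect temporal order contribute the most."""
    t_q, t_s = sim.shape
    q = np.arange(t_q)[:, None] / max(t_q - 1, 1)  # query positions in [0, 1]
    s = np.arange(t_s)[None, :] / max(t_s - 1, 1)  # support positions in [0, 1]
    prior = np.exp(-((q - s) ** 2) / (2 * sigma ** 2))
    return (sim * prior).sum() / prior.sum()

# Toy usage with two 8-frame videos and 512-dim frame features.
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 512)); q /= np.linalg.norm(q, axis=1, keepdims=True)
s = rng.normal(size=(8, 512)); s /= np.linalg.norm(s, axis=1, keepdims=True)
sim = frame_similarity(q, s)
print(appearance_score(sim), temporal_score(sim))
```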
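
The abstract likewise only names inductive and transductive prototype refinement without detail. The sketch below shows a generic transductive refinement loop in the soft k-means style, assuming video-level features compared by squared Euclidean distance; per the abstract, the paper instead drives this step with the appearance and temporal similarity scores above. The function name, temperature, and iteration count are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def refine_prototypes(prototypes, support, support_labels, queries,
                      n_iters=5, temperature=10.0):
    """Transductive refinement: softly assign unlabeled queries to the
    current prototypes, then re-estimate each prototype from the labeled
    support videos (hard weights) and the queries (soft weights).

    prototypes:     (C, D) initial class prototypes (e.g. support means)
    support:        (N_s, D) support video features
    support_labels: (N_s,) integer class labels
    queries:        (N_q, D) unlabeled query video features
    """
    num_classes = prototypes.shape[0]
    hard = np.eye(num_classes)[support_labels]          # (N_s, C) one-hot
    feats = np.concatenate([support, queries], axis=0)  # (N, D)
    for _ in range(n_iters):
        # Soft assignment of each query by (negative) squared distance.
        dists = ((queries[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        soft = softmax(-temperature * dists, axis=1)    # (N_q, C)
        # Weighted mean over support (hard) and queries (soft).
        weights = np.concatenate([hard, soft], axis=0)  # (N, C)
        prototypes = (weights.T @ feats) / weights.sum(axis=0)[:, None]
    return prototypes
```

In the inductive setting no unlabeled queries are available, so the prototypes stay at their support-only initialization; the transductive variant lets the unlabeled query set reshape the prototypes before classification.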

Cite

APA: Nguyen, K. D., Tran, Q. H., Nguyen, K., Hua, B.-S., & Nguyen, R. (2022). Inductive and Transductive Few-Shot Video Classification via Appearance and Temporal Alignments. In Lecture Notes in Computer Science (Vol. 13680 LNCS, pp. 471–487). Springer. https://doi.org/10.1007/978-3-031-20044-1_27
