Short-Term and Long-Term Context Aggregation Network for Video Inpainting


Abstract

Video inpainting aims to restore missing regions of a video and has many applications, such as video editing and object removal. However, existing methods either suffer from inaccurate short-term context aggregation or rarely explore long-term frame information. In this work, we present a novel context aggregation network that effectively exploits both short-term and long-term frame information for video inpainting. In the encoding stage, we propose boundary-aware short-term context aggregation, which aligns local regions from neighbor frames that are closely related to the boundary context of the missing regions and aggregates them into the target frame (i.e., the current input frame being inpainted). Furthermore, we propose dynamic long-term context aggregation, which globally refines the feature map generated in the encoding stage using long-term frame features that are dynamically updated throughout the inpainting process. Experiments show that our network outperforms state-of-the-art methods, producing better inpainting results at faster speed.
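The two components described in the abstract can be illustrated with a minimal PyTorch sketch. This is a hypothetical simplification, not the authors' implementation: the module names, the full-frame attention used for alignment, and the running-average memory update are all assumptions made for illustration; the paper's actual boundary-aware alignment and dynamic update rules differ (see the DOI below).

```python
# Hypothetical sketch of the paper's data flow; names and mechanisms are
# simplified assumptions, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShortTermAggregation(nn.Module):
    """Stand-in for boundary-aware short-term aggregation: each target-frame
    position attends to all neighbor-frame positions, and the aggregated
    context is blended in only near the hole boundary."""

    def __init__(self, channels: int):
        super().__init__()
        self.to_q = nn.Conv2d(channels, channels, 1)
        self.to_kv = nn.Conv2d(channels, 2 * channels, 1)

    def forward(self, target, neighbors, boundary_mask):
        # target:        (B, C, H, W)    frame under inpainting
        # neighbors:     (B, T, C, H, W) T neighbor frames
        # boundary_mask: (B, 1, H, W)    soft mask around the missing region
        b, t, c, h, w = neighbors.shape
        q = self.to_q(target).flatten(2).transpose(1, 2)           # (B, HW, C)
        kv = self.to_kv(neighbors.flatten(0, 1))                   # (B*T, 2C, H, W)
        k, v = kv.view(b, t, 2 * c, h * w).chunk(2, dim=2)         # (B, T, C, HW) each
        k = k.permute(0, 1, 3, 2).reshape(b, t * h * w, c)         # (B, T*HW, C)
        v = v.permute(0, 1, 3, 2).reshape(b, t * h * w, c)
        attn = torch.softmax(q @ k.transpose(1, 2) / c ** 0.5, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)       # aligned context
        return target + boundary_mask * out  # inject context near the boundary


class DynamicLongTermAggregation(nn.Module):
    """Simplified dynamic long-term aggregation: a small token memory is
    updated as a running average over long-term frames, and the encoded
    feature map is globally refined by attending to it."""

    def __init__(self, channels: int, momentum: float = 0.9):
        super().__init__()
        self.momentum = momentum
        self.to_q = nn.Conv2d(channels, channels, 1)
        self.memory = None  # (B, M, C), built lazily on the first update

    def update_memory(self, long_term_feat):
        # Pool one long-term frame's features into 16 tokens and mix them
        # into the running memory (the "dynamic update" of the abstract).
        tokens = F.adaptive_avg_pool2d(long_term_feat, 4)          # (B, C, 4, 4)
        tokens = tokens.flatten(2).transpose(1, 2)                 # (B, 16, C)
        if self.memory is None:
            self.memory = tokens
        else:
            self.memory = self.momentum * self.memory + (1 - self.momentum) * tokens

    def forward(self, feat):
        # feat: (B, C, H, W) encoder output to refine with long-term context
        b, c, h, w = feat.shape
        q = self.to_q(feat).flatten(2).transpose(1, 2)             # (B, HW, C)
        attn = torch.softmax(q @ self.memory.transpose(1, 2) / c ** 0.5, dim=-1)
        out = (attn @ self.memory).transpose(1, 2).reshape(b, c, h, w)
        return feat + out


if __name__ == "__main__":
    B, T, C, H, W = 1, 2, 8, 16, 16
    short_term = ShortTermAggregation(C)
    long_term = DynamicLongTermAggregation(C)
    feat = short_term(torch.randn(B, C, H, W),         # target frame features
                      torch.randn(B, T, C, H, W),      # neighbor frame features
                      torch.rand(B, 1, H, W))          # boundary mask
    long_term.update_memory(torch.randn(B, C, H, W))   # one long-term frame
    print(long_term(feat).shape)                       # torch.Size([1, 8, 16, 16])
```

The sketch only mirrors the overall two-stage data flow (short-term aggregation in the encoder, then global long-term refinement); in the paper, alignment is restricted to regions related to the hole boundary rather than full-frame attention, and the long-term update is learned rather than a fixed running average.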


Citation (APA)

Li, A., Zhao, S., Ma, X., Gong, M., Qi, J., Zhang, R., … Kotagiri, R. (2020). Short-Term and Long-Term Context Aggregation Network for Video Inpainting. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12349 LNCS, pp. 728–743). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-58548-8_42
