Motion-Based Occlusion-Aware Pixel Graph Network for Video Object Segmentation

Abstract

This paper proposes a dual-channel Graph Convolutional Network (GCN) for the Video Object Segmentation (VOS) task. The main contribution lies in formulating two pixel graphs based on raw RGB and optical flow features. Spatial and temporal features are learned independently, making the network robust to various challenging scenarios in real-world videos. Additionally, a motion orientation-based aggregator scheme efficiently captures long-range dependencies among objects. This not only addresses the complex issue of modelling velocity differences among multiple objects moving in various directions, but also adapts to appearance changes caused by pose and scale deformations. An occlusion-aware attention mechanism is also employed to facilitate accurate segmentation in scenarios where multiple objects exhibit temporal discontinuity in their appearance due to occlusion. Performance analysis on the DAVIS-2016 and DAVIS-2017 datasets shows the effectiveness of the proposed method in foreground segmentation of objects in videos over existing state-of-the-art techniques. Control experiments on the CamVid dataset show the generalising capability of the model for scene segmentation.
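
The abstract gives no implementation details, but the dual-channel idea can be sketched roughly. The following PyTorch snippet is a minimal, hypothetical illustration, not the authors' architecture: it builds one pixel graph from appearance (RGB) similarity and one from optical flow, with the flow graph using a simple orientation-binning aggregator so that long-range message passing is pooled along each object's motion direction. All names (PixelGraphConv, orientation_adjacency, DualChannelGCN) and design choices here are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelGraphConv(nn.Module):
    """One graph-convolution layer over a pixel graph.
    Nodes are pixels; `adj` is a row-normalised adjacency built from
    feature similarity in either the RGB or the optical-flow channel."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, in_dim) node features; adj: (N, N) normalised adjacency.
        return F.relu(self.linear(adj @ x))

def orientation_adjacency(flow, n_bins=8):
    """Hypothetical motion-orientation aggregator: connect pixels whose
    flow vectors fall in the same orientation bin, so dependencies are
    aggregated along each object's own motion direction.
    flow: (N, 2) per-pixel optical flow."""
    angle = torch.atan2(flow[:, 1], flow[:, 0])          # in (-pi, pi]
    bins = ((angle + torch.pi) / (2 * torch.pi) * n_bins).long() % n_bins
    same_bin = (bins[:, None] == bins[None, :]).float()  # (N, N)
    deg = same_bin.sum(dim=1, keepdim=True).clamp(min=1)
    return same_bin / deg                                # row-normalise

class DualChannelGCN(nn.Module):
    """Two independent pixel-graph streams (appearance + motion) whose
    outputs are fused into per-pixel foreground logits."""
    def __init__(self, rgb_dim=3, flow_dim=2, hidden=64):
        super().__init__()
        self.rgb_gcn = PixelGraphConv(rgb_dim, hidden)
        self.flow_gcn = PixelGraphConv(flow_dim, hidden)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, rgb_feats, flow_feats, rgb_adj, flow_adj):
        h_app = self.rgb_gcn(rgb_feats, rgb_adj)     # spatial stream
        h_mot = self.flow_gcn(flow_feats, flow_adj)  # temporal stream
        return self.head(torch.cat([h_app, h_mot], dim=-1))  # (N, 1)

if __name__ == "__main__":
    N = 256                       # pixels in a toy frame patch
    rgb = torch.rand(N, 3)
    flow = torch.randn(N, 2)
    # Appearance graph: a normalised RGB-similarity graph (toy choice).
    sim = torch.exp(-torch.cdist(rgb, rgb))
    rgb_adj = sim / sim.sum(dim=1, keepdim=True)
    flow_adj = orientation_adjacency(flow)
    logits = DualChannelGCN()(rgb, flow, rgb_adj, flow_adj)
    print(logits.shape)           # torch.Size([256, 1])
```

The occlusion-aware attention mechanism described in the abstract would sit on top of such streams (e.g. reweighting node features across frames); since the abstract does not specify its form, it is omitted here.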

Citation (APA)

Adak, S., & Das, S. (2019). Motion-Based Occlusion-Aware Pixel Graph Network for Video Object Segmentation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11954 LNCS, pp. 516–527). Springer. https://doi.org/10.1007/978-3-030-36711-4_43
