Abstract
Human instance segmentation is a core problem in human-centric scene understanding. Segmenting human instances poses a unique challenge to vision systems due to large intra-class variations in both appearance and shape, as well as complicated occlusion patterns. In this paper, we propose a new pose-aware human instance segmentation method. In contrast to previous pose-aware methods, which first predict bottom-up poses and then estimate instance segmentation on top of the predicted poses, our method integrates both top-down and bottom-up cues for an instance: it adopts detection results as human proposals and jointly estimates human pose and instance segmentation for each proposal. We develop a modular recurrent deep network that utilizes pose estimation to refine instance segmentation in an iterative manner. Our refinement modules exploit pose cues at two levels: as a coarse shape prior and as local part attention. We evaluate our approach on two public multi-person benchmarks: the OCHuman dataset and the COCOPersons dataset. The proposed method surpasses the state-of-the-art methods on OCHuman by 3.0 mAP and on COCOPersons by 6.4 mAP, demonstrating the effectiveness of our approach.
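The iterative, pose-guided refinement described above can be illustrated with a minimal sketch. This is not the paper's learned recurrent module; it is a hypothetical stand-in in which a fixed update blends the current mask with a pose-derived coarse shape prior and then re-weights it by a local part-attention map. The function name `refine_mask` and the specific blending rule are assumptions for illustration only.

```python
import numpy as np

def refine_mask(mask, shape_prior, part_attention, iters=3):
    """Iteratively refine an instance mask with two pose-derived cues.

    Hypothetical sketch of the refinement loop described in the abstract:
      1. blend the current mask estimate with a coarse shape prior
         rendered from the predicted pose (the "coarse" cue);
      2. re-weight the result by a part-attention map that emphasizes
         regions near predicted body parts (the "local" cue).
    The real method uses a learned recurrent network; the fixed 50/50
    blend and multiplicative attention here are illustrative choices.
    """
    m = mask.astype(float)
    for _ in range(iters):
        m = 0.5 * m + 0.5 * shape_prior                 # coarse shape prior
        m = np.clip(m * (1.0 + part_attention), 0.0, 1.0)  # part attention
    return m

# Toy usage: an empty initial mask pulled toward a pose-shaped prior.
mask = np.zeros((4, 4))
prior = np.ones((4, 4))
attention = np.zeros((4, 4))
refined = refine_mask(mask, prior, attention)  # converges toward the prior
```

With zero attention, each iteration halves the gap to the prior, so after three iterations the mask value is 0.875 everywhere; this mirrors, in miniature, how repeated refinement steps let pose cues progressively correct the segmentation.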
Zhou, D., & He, Q. (2020). PoSeg: Pose-aware Refinement Network for Human Instance Segmentation. IEEE Access, 8, 15007–15016. https://doi.org/10.1109/ACCESS.2020.2967147