In this paper, we propose a new method for human parsing that maintains high-resolution representations and leverages body edge details to improve performance. First, we propose a hybrid resolution network (HyRN) for human parsing and body edge detection. In HyRN, we adopt deconvolution operations and auxiliary supervision to increase the discriminative ability of the features at each scale. Second, considering the close relationship between human parsing and body edge detection, we propose a dual-task cascaded framework (DTCF), which implicitly integrates parsing and edge features to progressively refine the parsing results. Third, we develop an edge-guided region mutual information loss, which uses the edge detection results to explicitly maintain high-order consistency between the parsing prediction and the ground truth around body edge pixels. When evaluated on standard benchmarks, our proposed HyRN achieves accuracy competitive with state-of-the-art human parsing methods. Moreover, our DTCF further improves performance and outperforms the established baseline approach by 3.42 points w.r.t. mIoU on the LIP dataset.
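For readers unfamiliar with the reported metric, the following is a minimal sketch of mean intersection-over-union (mIoU) on flattened label maps. It is an illustration only, not the authors' evaluation code: the function name `mean_iou`, the argument layout, and the choice to skip classes absent from both prediction and ground truth are assumptions made here. Benchmark protocols typically accumulate the counts over the whole dataset before averaging.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred, target: equal-length sequences of integer class labels
    (e.g. a flattened parsing map). Hypothetical helper for illustration.
    """
    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:                  # skip classes absent from both maps
            ious.append(intersection[c] / union)

    # Assumption: return 0.0 when no class is present at all.
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: 4 pixels, 2 classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. A gain stated in "points" of mIoU refers to a difference in this quantity expressed as a percentage.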
CITATION STYLE
Liu, Y., Zhao, L., Zhang, S., & Yang, J. (2020). Hybrid Resolution Network Using Edge Guided Region Mutual Information Loss for Human Parsing. In MM 2020 - Proceedings of the 28th ACM International Conference on Multimedia (pp. 1670–1678). Association for Computing Machinery, Inc. https://doi.org/10.1145/3394171.3413831