Recent multi-person pose estimation networks rely on sequential downsampling and upsampling to capture multi-scale features, and on stacking basic modules to reassess local and global contexts. However, the resulting networks have a large number of parameters and are difficult to train with limited computational resources. Motivated by this observation, we design a lite version of the Hourglass module that uses hybrid convolution blocks to reduce the number of parameters while maintaining performance. The hybrid convolution block builds multi-context paths using dilated convolutions with different rates, which not only reduces the number of parameters but also enlarges the receptive field. Moreover, due to the limitations of the heatmap representation, these networks need additional, non-differentiable post-processing to convert heatmaps to keypoint coordinates. We therefore propose a simple and efficient operation based on integral loss to fill this gap, specifically for bottom-up pose estimation methods. We demonstrate that the proposed approach achieves better performance than the baseline methods on the challenging MSCOCO benchmark dataset for multi-person pose estimation.
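To illustrate the idea of replacing non-differentiable heatmap post-processing with an integral operation, the following is a minimal sketch of a generic soft-argmax (integral regression) over a single heatmap. It is an assumption-laden illustration, not the operation proposed in the paper: the function name `soft_argmax_2d`, the temperature parameter `beta`, and the use of NumPy are all hypothetical choices, and the abstract does not specify how the operation is adapted to bottom-up methods.

```python
import numpy as np


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) location of one keypoint heatmap.

    Hypothetical helper: a generic soft-argmax, assumed here as a
    stand-in for an integral-style coordinate readout.
    """
    h, w = heatmap.shape
    # Softmax over all pixels turns the heatmap into a probability map.
    logits = beta * heatmap.reshape(-1).astype(np.float64)
    logits = logits - logits.max()  # subtract the max for numerical stability
    prob = np.exp(logits)
    prob /= prob.sum()
    prob = prob.reshape(h, w)
    # Expected coordinate = sum of pixel indices weighted by probability.
    x = (prob.sum(axis=0) * np.arange(w)).sum()
    y = (prob.sum(axis=1) * np.arange(h)).sum()
    return x, y


# Usage: a sharp peak at column 5, row 3 of an 8x8 heatmap.
hm = np.zeros((8, 8))
hm[3, 5] = 1.0
x_peak, y_peak = soft_argmax_2d(hm, beta=100.0)
```

Unlike a hard argmax, every step here is a smooth function of the heatmap values, so a coordinate loss could be backpropagated through it in an autodiff framework. A larger `beta` pulls the estimate toward the hard argmax location, while a flat heatmap yields the centre of the grid.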
Zhao, Y., Luo, Z., Quan, C., Liu, D., & Wang, G. (2020). Lite Hourglass Network for Multi-person Pose Estimation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11962 LNCS, pp. 226–238). Springer. https://doi.org/10.1007/978-3-030-37734-2_19