Our 'You Only Move Once' (YOMO) detector, built on depthwise separable convolutions, is a single-stage face detector that balances accuracy and latency. YOMO achieves scale invariance through a top-down architecture with feature agglomeration and multiple detection modules, rather than an image pyramid. We also propose a semi-soft random cropping algorithm that allows each detection module to be adequately trained on samples of different scales. Experiments on the FDDB dataset, evaluated with both the discrete and the continuous measure, show that the method achieves highly competitive results. With an ellipse regressor, the recall rates reach 97.59% and 83.66%, respectively. Notably, YOMO has only 21 million parameters and runs at 51 frames per second (FPS) on a GPU for a 544×544 input image.
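To illustrate why depthwise separable convolutions help keep the parameter count low, the following minimal sketch compares the weight count of a standard convolution with that of its depthwise separable counterpart. The function names, the 3×3 kernel, and the channel sizes are illustrative assumptions; they are not taken from the YOMO architecture, and biases are ignored.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k=3):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out = 128, 256
std = standard_conv_params(c_in, c_out)
sep = depthwise_separable_params(c_in, c_out)
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.2f}x")
```

For a 3×3 kernel the reduction factor approaches 9 as the number of output channels grows, which is the usual motivation for this kind of layer in latency-sensitive detectors.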
Xu, J., Tian, Y., Wu, H., Luo, B., & Guo, J. (2019). You only Move Once: An Efficient Convolutional Neural Network for Face Detection. IEEE Access, 7, 169528–169536. https://doi.org/10.1109/ACCESS.2019.2954936