Accurately detecting and repairing face flaws is important for obtaining high-resolution, detail-preserving face images in everyday social use. However, there is a significant gap between face flaws and the features of natural scenes. In this paper, we propose HRT-YOLO, a framework that addresses the small scale, irregular shape, and regional overlap of face flaws. Building on YOLOv5, we introduce a high-resolution representation, in which a main network of high-resolution feature maps runs in parallel with gradually added sub-networks of low-resolution feature maps, allowing flaws to be localized more accurately. We also implement transformer detection heads, which use multi-head attention to mine contextual features and improve detection in flaw-dense regions. Finally, we present a high-resolution face flaw dataset to evaluate our approach. The results show that HRT-YOLO performs well on face flaw detection, improving mAP by 3.23% and accuracy by 5.33% over the YOLOv5 baseline.
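To make the multi-head attention step concrete, the following is a minimal sketch of self-attention over a flattened feature map, written with the Python standard library only. It is not the authors' implementation: the function names, the token layout, and the random projection matrices (which stand in for learned weights) are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def multi_head_self_attention(tokens, w_q, w_k, w_v, w_o, num_heads):
    """Hypothetical helper: tokens is N x D, each w_* is D x D.

    Returns the N x D output and one N x N attention map per head.
    """
    dim = len(tokens[0])
    assert dim % num_heads == 0, "channel count must split evenly across heads"
    head_dim = dim // num_heads
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)

    head_outputs, attn_maps = [], []
    for h in range(num_heads):
        lo, hi = h * head_dim, (h + 1) * head_dim
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        k_t = [list(col) for col in zip(*k_h)]  # transpose keys
        scores = matmul(q_h, k_t)
        # Scaled dot-product attention: every token attends to every other token.
        attn = [softmax([s / math.sqrt(head_dim) for s in row]) for row in scores]
        attn_maps.append(attn)
        head_outputs.append(matmul(attn, v_h))

    # Concatenate the heads channel-wise, then apply the output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(tokens))]
    return matmul(concat, w_o), attn_maps


rng = random.Random(0)


def random_matrix(rows, cols):
    """Random stand-in for learned weights or backbone features (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# A 2 x 3 feature map with 4 channels, flattened into 6 tokens.
tokens = random_matrix(6, 4)
weights = [random_matrix(4, 4) for _ in range(4)]
output, attn_maps = multi_head_self_attention(tokens, *weights, num_heads=2)
```

Each row of an attention map is a probability distribution over all spatial positions, which is how a detection head of this kind can draw on context from the whole feature map when flaws are small or overlap.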
Sun, Y., Li, X., & Zhang, X. (2022). HRT-YOLO: A transformer-based high-resolution representation model for face flaw detection. In Journal of Physics: Conference Series (Vol. 2258). Institute of Physics. https://doi.org/10.1088/1742-6596/2258/1/012032