Residual Transformer YOLO for Detecting Multi-Scale Crowded Pedestrian

Hechao Ye; Yanni Wang

Journal Article, Open Access

Applied Sciences (Switzerland) (2023) 13(21)

DOI: 10.3390/app132112032

18 Citations

12 Readers

Abstract

Crowding and occlusion pose significant challenges for pedestrian detection and readily cause missed and false detections of small-scale and occluded pedestrians in dense scenes. To improve detection accuracy in such scenes, we propose the Residual Transformer YOLO (RT-YOLO) algorithm. Building on YOLOv7, RT-YOLO enhances the multi-scale fusion strategy and introduces a dedicated detection layer for small-scale occluded targets. It also integrates ResNet and Transformer structures to improve the small-scale feature layer and detection head, strengthening feature extraction. In addition, RT-YOLO incorporates the Normalization-based Attention Module (NAM) into the backbone and neck networks to identify regions of interest. Experiments on the CrowdHuman and WiderPerson datasets show that, at an Intersection over Union (IoU) threshold of 0.5, mAP@0.5 improves by 3.8% and 3.4%, respectively. Over the IoU range from 0.5 to 0.95, mAP@0.5:0.95 improves by 5.1% and 4%, respectively. RT-YOLO runs at 67 FPS, maintaining real-time performance. On the VOC2007 dataset, mAP improves by 5.1%, indicating that the approach is effective and robust.
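The reported metrics depend on the IoU threshold used to decide whether a predicted box matches a ground-truth box. The sketch below illustrates the standard IoU definition and the threshold sweep commonly used for averaged mAP. It is not taken from the paper: the function names `iou` and `is_match`, the `(x1, y1, x2, y2)` box format, and the 0.05 step between thresholds are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Thresholds swept when averaging mAP over an IoU range (assumed step of 0.05).
iou_thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

if __name__ == "__main__":
    pred = (0, 0, 2, 2)
    gt = (1, 0, 3, 2)
    print(iou(pred, gt))       # overlap area 2, union area 6
    print(is_match(pred, gt))  # below the 0.5 threshold
    print(iou_thresholds)
```

Heavily occluded pedestrians tend to produce boxes that overlap their neighbors, so a stricter threshold penalizes loose localization more than a threshold of 0.5 does.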

Cite

Ye, H., & Wang, Y. (2023). Residual Transformer YOLO for Detecting Multi-Scale Crowded Pedestrian. Applied Sciences (Switzerland), 13(21). https://doi.org/10.3390/app132112032
