Abstract
Objective Pedestrian detection is a widely studied topic in computer vision and a basic, critical technology in automatic driving assistance systems, visual surveillance, and behavior recognition. In the traffic environment, pedestrians and cyclists belong to the “vulnerable groups on the road”. World Health Organization (WHO) statistics show that approximately half of all fatalities in road accidents involve pedestrians. Unlike conventional detection objects (e.g., automobiles) with relatively stable structural characteristics, pedestrians are nonrigid: their changing limb poses make their structure unstable, which complicates detection. Night scenes are harder still, yet domestic and international research on nighttime pedestrian detection remains insufficient. Under insufficient illumination and local overexposure, pedestrian recognition algorithms lose accuracy, leading to missed and false detections. Therefore, nighttime pedestrian detection technology has important research and social value for ensuring pedestrian safety. Method Nighttime monitoring is constrained by uneven and insufficient lighting, so the acquired images are poorly exposed, which reduces the effectiveness of pedestrian detection. To address this issue, this study adds a low-light enhancement module (Zero-DCE) to the detector to improve the model’s nighttime detection performance. We feed the detector’s regression loss and the detected location information to the low-light enhancement module so that low-light image enhancement and pedestrian detection are trained jointly, which makes the enhancement a positive gain for the detection task.
This approach maintains the regional continuity of pedestrian features in the image and avoids the loss of detection accuracy that occurs when pixel-level low-light enhancement destroys features in the pedestrian region. Pedestrian detection has a long history. Strategies that model human features with histograms of oriented gradients (HOG) and classify them with a support vector machine (SVM) have been widely studied. However, these traditional methods rely on feature engineering, and hand-crafted features have low accuracy and generalize poorly. In recent years, deep learning algorithms have been applied to pedestrian detection. Convolutional neural networks (CNNs) can extract high-level features and have gradually become the mainstream approach. Depending on whether they rely on region proposals, deep learning-based pedestrian detection algorithms can be broadly divided into two-stage and one-stage methods. Two-stage methods first use sliding windows to find preselected regions in the image and then classify and regress those regions. Representative methods are R-CNN and Faster R-CNN. Because region-proposal-based algorithms capture rich features, their detection accuracy is high, but they suffer from redundant preselected regions and slow inference. One-stage methods do not rely on region proposals. Instead, they directly regress the target’s position in the image, which simplifies the detection process and speeds up inference. Representative methods are the single shot multibox detector (SSD), you only look once v3 (YOLOv3), and YOLOX, proposed by MEGVII. Weighing detection accuracy against inference speed, this study selects the one-stage method YOLOX as the baseline model.
The baseline is then optimized specifically for night scenes. Another significant issue in pedestrian detection is the missed and false detections caused by interclass occlusion and dense crowds. The original non-maximum suppression (NMS) algorithm tends to delete valid detection boxes when many pedestrians are clustered together, which leads to missed pedestrians. To address this problem, this study reconsiders the NMS strategy at the inference stage and introduces an NMS algorithm, nearby objects hallucinator (NOH), that adds distribution information about nearby pedestrian targets. We remove NOH’s dependence on region proposals so that it can be ported to one-stage detection algorithms. The bounding box features predicted by YOLOX are pooled into the same feature space, and a simple fully connected module then builds the location distribution and density information of nearby pedestrians required by NOH. The improved NOH module is combined with the original YOLOHead to form Pedestrian-Head, which produces the final pedestrian detections. Experiments show that adding this fully connected module effectively reduces the missed detections caused by occlusion, and the inference speed is slightly improved. However, fully connected modules inevitably add redundant parameters to the network. Therefore, this study further investigates reducing the model size. The lightweight model uses depthwise separable convolution to maintain detection accuracy while reducing the computation required for inference. The floating-point computation of the lightweight model is reduced to 22.4 GFLOPs. In theory, our algorithm can meet the real-time inference needs of mobile devices. Result We verified the method on the NightSurveillance dataset with three groups of ablation experiments.
Compared with the baseline model (YOLOX), NSPDet increases the average precision (AP) and the average recall (AR) by 10.1 and 7.2, respectively. The lightweight NSPDet model has 16.4 M fewer parameters, and its AP and AR decrease by 7.6 and 6.2, respectively; nevertheless, it still outperforms the baseline model. Comparison experiments against other methods on the Caltech, CityPersons, and NightOwls datasets show that the proposed nighttime pedestrian detection algorithm has a low average false detection rate. Conclusion The NSPDet algorithm proposed in this study improves the accuracy of the baseline model for pedestrian detection at night and is capable of real-time inference. The improvement holds across various complex nighttime scenes, including low light, strong light interference, image blur, occlusion, and rainy weather. The algorithm has important application value for promoting research in autonomous driving and intelligent transportation.
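The crowd failure mode of standard NMS described in the Method can be seen in a small sketch. The code below is plain greedy NMS, not the NOH-based strategy of the paper; the box coordinates, scores, and thresholds are hypothetical values chosen only to show how a second, genuine pedestrian standing close to the first is suppressed.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Standard greedy NMS: visit boxes by descending score and keep a box
    only if it does not overlap any already-kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections of two different pedestrians walking side by side.
boxes = [(0, 0, 10, 20), (3, 0, 13, 20)]
scores = [0.9, 0.8]

# Their IoU is 140 / 260, about 0.54, which exceeds the fixed 0.5 threshold,
# so the second pedestrian is removed even though it is a true detection.
kept_fixed = greedy_nms(boxes, scores, iou_threshold=0.5)

# A looser threshold keeps both, which is the kind of local adjustment that
# density information about nearby pedestrians is meant to justify.
kept_loose = greedy_nms(boxes, scores, iou_threshold=0.6)
```

With a single fixed threshold there is no setting that both removes duplicates in sparse scenes and preserves neighbors in dense ones, which is why the paper supplies NMS with information about nearby pedestrians.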
Gong, A., Li, Z., & Liang, C. (2023). NSPDet: real-time nearby-aware pedestrian detection algorithm for multi-scene surveillance at night. Journal of Image and Graphics, 28(9), 2693–2705. https://doi.org/10.11834/jig.220834