The main contribution of this paper is an approach that introduces more context to improve the accuracy of the conventional SSD (Single Shot Multibox Detector), one of the leading object detection algorithms in terms of both accuracy and speed. We augment SSD with a multi-level feature fusion method at shallow layers, which introduces contextual information to improve accuracy, especially for the detection of small objects. We call the resulting system SFSSD (Shallow Feature Fusion Single Shot Multibox Detector). In the feature fusion module, features from different layers with different scales are concatenated and then passed through down-sampling blocks to generate a new feature pyramid, which is fed to the multibox detectors to predict the final detection results. On the Pascal VOC2007 test set, with training on the VOC2007 and VOC2012 training sets, the proposed network achieved 75.4 mAP (mean average precision) with 300 × 300 input and 79.7 mAP with 512 × 512 input. SFSSD achieves state-of-the-art mAP, outperforming the conventional SSD, Fast R-CNN, Faster R-CNN, ION and MR-CNN.
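The data flow of the feature fusion module described above (resize, concatenate, then down-sample into a pyramid) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how feature maps are brought to a common resolution or how the down-sampling blocks are built, so the nearest-neighbour resizing, the 2 × 2 average pooling used in place of learned convolutional blocks, and the function names `resize_nearest`, `downsample` and `fuse_and_build_pyramid` are all assumptions made for illustration.

```python
import numpy as np


def resize_nearest(fmap, size):
    """Nearest-neighbour resize of a (C, H, W) feature map to (C, size, size).

    Assumption: the abstract does not specify the resizing method.
    """
    _, h, w = fmap.shape
    rows = np.arange(size) * h // size  # source row for each target row
    cols = np.arange(size) * w // size  # source column for each target column
    return fmap[:, rows[:, None], cols[None, :]]


def downsample(fmap):
    """Halve the spatial size with 2x2 average pooling.

    Assumption: a stand-in for the paper's learned down-sampling blocks.
    Odd sizes are edge-padded by one pixel so that 19 -> 10, 5 -> 3, etc.
    """
    c, h, w = fmap.shape
    ph, pw = h % 2, w % 2
    padded = np.pad(fmap, ((0, 0), (0, ph), (0, pw)), mode="edge")
    blocks = padded.reshape(c, (h + ph) // 2, 2, (w + pw) // 2, 2)
    return blocks.mean(axis=(2, 4))


def fuse_and_build_pyramid(fmaps, levels):
    """Concatenate multi-scale features, then build a feature pyramid.

    fmaps  : list of (C_i, H_i, W_i) arrays taken from different layers
    levels : number of pyramid levels to return (hypothetical parameter)
    """
    # Bring every map to the largest spatial resolution among the inputs.
    target = max(f.shape[1] for f in fmaps)
    resized = [resize_nearest(f, target) for f in fmaps]

    # Fuse by concatenating along the channel axis.
    fused = np.concatenate(resized, axis=0)

    # Repeatedly down-sample the fused map to obtain coarser levels.
    pyramid = [fused]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid
```

As a usage example, passing three maps with spatial sizes 38, 19 and 10 and requesting four levels yields fused maps of sizes 38, 19, 10 and 5, each carrying the sum of the input channel counts. In a real detector each level would then be fed to its own multibox prediction head.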
CITATION STYLE
Wang, D., Zhang, B., Cao, Y., & Lu, M. (2020). SFSSD: Shallow Feature Fusion Single Shot Multibox Detector. In Lecture Notes in Electrical Engineering (Vol. 571 LNEE, pp. 2590–2598). Springer. https://doi.org/10.1007/978-981-13-9409-6_316