Because ships vary widely in shape and scale and the sea background is complex, accurately detecting multi-scale ships at sea while meeting real-time requirements remains a challenge. To address this problem, we propose S-DETR, a model based on the DETR framework for end-to-end detection of ships at sea. We design a scale attention module that learns the weights of information at different scales by exploiting the global information provided by global average pooling. We analyze the potential reasons for the performance degradation of the end-to-end detector and propose a decoder based on Dense Query. Although the computational complexity and convergence of the full S-DETR model have not been rigorously proven mathematically, Dense Query reduces the computational complexity of multi-head self-attention from (Formula presented.) to (Formula presented.). To evaluate S-DETR, we conducted experiments on the Singapore Maritime Dataset and the Marine Image Dataset. The experimental results show that the proposed method handles multi-scale ship detection in complex marine environments effectively and achieves state-of-the-art performance. The inference speed of S-DETR is comparable to that of single-stage object detection models and meets the real-time requirements of shoreside ship detection.
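The scale attention idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' module: it assumes one single-channel feature map per scale, already resized to a common shape, and a hypothetical learned scalar `gains` per scale. Each map is summarized by global average pooling, the pooled values are turned into per-scale weights with a softmax, and the maps are fused as a weighted sum.

```python
import math


def global_average_pool(feature_map):
    """Return the mean over all spatial positions of one H x W map."""
    values = [v for row in feature_map for v in row]
    return sum(values) / len(values)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scale_attention(feature_maps, gains):
    """Weight and fuse feature maps from several scales.

    feature_maps: list of H x W maps (nested lists), one per scale,
                  assumed to be resized to the same shape beforehand.
    gains:        hypothetical learned scalar per scale (an assumption
                  of this sketch, not a parameter named in the paper).
    """
    # One global descriptor per scale from global average pooling.
    descriptors = [global_average_pool(fm) for fm in feature_maps]
    # Per-scale attention weights that sum to 1.
    weights = softmax([g * d for g, d in zip(gains, descriptors)])
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    # Weighted sum of the maps at every spatial position.
    fused = [
        [sum(w * fm[i][j] for w, fm in zip(weights, feature_maps))
         for j in range(width)]
        for i in range(height)
    ]
    return weights, fused


if __name__ == "__main__":
    coarse = [[1.0, 1.0], [1.0, 1.0]]
    fine = [[3.0, 3.0], [3.0, 3.0]]
    weights, fused = scale_attention([coarse, fine], gains=[1.0, 1.0])
    print(weights, fused)
```

In a real detector the pooled descriptors would typically pass through learned layers over many channels; the scalar gains here only stand in for that learned mapping.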
Xing, Z., Ren, J., Fan, X., & Zhang, Y. (2023). S-DETR: A Transformer Model for Real-Time Detection of Marine Ships. Journal of Marine Science and Engineering, 11(4). https://doi.org/10.3390/jmse11040696