YOLF-ShipPnet: Improved RetinaNet with Pyramid Vision Transformer

Abstract

Ship images are difficult to detect accurately because ships vary in orientation, color contrast, and shape. Together, these factors make high detection precision hard to achieve, which motivates investigating more advanced networks for ship image detection. In this paper, we propose an improved network, YOLF-ShipPnet, which uses a popular pyramid vision transformer with increased depth as the backbone of RetinaNet. To increase the model's generalization ability, the hue, saturation, and value (HSV) random augmentation technique from You Only Look Once eXtreme (YOLOX) is employed during network construction to simulate lighting and color effects on ship images. Ablation experiments were conducted on two popular datasets: High-Resolution Ship Collections 2016 (HRSC2016) and the SAR Ship Detection Dataset (SSDD). Compared to the RetinaNet baseline, YOLF-ShipPnet improves detection precision and generalization ability in ship detection by 5.22% and 5.46%, respectively, exhibiting strong robustness and effectiveness. The proposed network is also applicable to fine-grained ship detection and achieves an accuracy improvement of 10.03% over the baseline network.
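
As a rough illustration of what HSV random augmentation does, the sketch below jitters the hue, saturation, and value of an image with one random draw per channel. It is not the YOLOX implementation or the authors' code: the function name `hsv_jitter`, the multiplicative form for saturation and value, and the default gains are all assumptions chosen for readability, and it uses only the Python standard library rather than an image library.

```python
import colorsys
import random


def hsv_jitter(pixels, h_gain=0.015, s_gain=0.7, v_gain=0.4, rng=None):
    """Randomly shift hue and scale saturation/value of RGB pixels.

    pixels: list of (r, g, b) tuples with channels in 0..255.
    The gain defaults are illustrative assumptions, not values from the paper.
    """
    rng = rng or random.Random()
    # One random draw per channel, shared by every pixel of the image.
    dh = rng.uniform(-h_gain, h_gain)        # additive hue shift (fraction of a full turn)
    ks = 1.0 + rng.uniform(-s_gain, s_gain)  # multiplicative saturation factor
    kv = 1.0 + rng.uniform(-v_gain, v_gain)  # multiplicative value (brightness) factor

    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        h = (h + dh) % 1.0                   # hue wraps around
        s = min(max(s * ks, 0.0), 1.0)       # clamp to the valid range
        v = min(max(v * kv, 0.0), 1.0)
        r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
        out.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
    return out


if __name__ == "__main__":
    image = [(200, 30, 30), (10, 120, 240), (128, 128, 128)]
    print(hsv_jitter(image, rng=random.Random(0)))
```

Because the same three random factors are applied to every pixel, the augmented image looks like the original under different lighting or color cast, while object positions and bounding boxes stay unchanged.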

Cite

Qiu, Z., Rong, S., & Ye, L. (2023). YOLF-ShipPnet: Improved RetinaNet with Pyramid Vision Transformer. International Journal of Computational Intelligence Systems, 16(1). https://doi.org/10.1007/s44196-023-00235-4
