YOLOv11-MSE: A Multi-Scale Dilated Attention-Enhanced Lightweight Network for Efficient Real-Time Underwater Target Detection

Abstract

Underwater target detection is a critical technology for marine resource management and ecological protection, but its performance is often limited by complex underwater conditions, including optical attenuation, scattering, and dense distributions of small targets. Existing methods have significant limitations in feature extraction efficiency, robustness in class-imbalanced scenarios, and computational complexity. To address these challenges, this study proposes a lightweight adaptive detection model, YOLOv11-MSE, which improves underwater detection performance through three core innovations. First, a multi-scale dilated attention (MSDA) mechanism is embedded into the backbone network to dynamically capture multi-scale contextual features while suppressing background noise. Second, a Slim-Neck architecture based on GSConv and VoV-GSCSPC modules is designed to achieve efficient feature fusion via hybrid convolution strategies, significantly reducing model complexity. Finally, an efficient multi-scale attention (EMA) module is introduced in the detection head to reinforce key feature representations and suppress environmental noise through cross-dimensional interactions. Experiments on the underwater detection dataset (UDD) demonstrate that YOLOv11-MSE outperforms the baseline model YOLOv11, achieving a 9.67% improvement in detection precision and a 3.45% increase in mean average precision (mAP50) while reducing computational complexity by 6.57%. Ablation studies further validate the synergistic effects of the modules, particularly in class-imbalanced scenarios, where detection of rare categories (e.g., scallops) is significantly enhanced, with precision and mAP50 improving by 60.62% and 10.16%, respectively. With its lightweight design and strong detection performance, the model provides an efficient solution for edge computing scenarios such as underwater robots and ecological monitoring.
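
As a rough illustration of what "multi-scale" means for dilated operations such as those in the MSDA mechanism, the sketch below computes how far a fixed-size kernel reaches at different dilation rates, using the standard relation k_eff = k + (k - 1)(d - 1). The abstract does not give the kernel size or dilation rates used by YOLOv11-MSE, so the 3x3 kernel, the rates (1, 2, 3), and the function names here are assumptions chosen purely for illustration, not the authors' configuration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in pixels along one axis, covered by a dilated kernel.

    Standard relation: k_eff = k + (k - 1) * (d - 1).
    """
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical dilation rates; the abstract does not state the real ones.
ASSUMED_DILATIONS = (1, 2, 3)


def multi_scale_spans(kernel_size: int = 3, dilations=ASSUMED_DILATIONS) -> dict:
    """Map each assumed dilation rate to the span its kernel covers."""
    return {d: effective_kernel_size(kernel_size, d) for d in dilations}


if __name__ == "__main__":
    for d, span in multi_scale_spans().items():
        print(f"dilation={d}: 3x3 kernel spans {span}x{span} pixels")
```

The point of the sketch is that the same number of kernel weights can cover progressively larger neighborhoods as the dilation rate grows, which is one way a module can gather context at several scales without a matching increase in parameters.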

Cite

Ye, Z., Peng, X., Li, D., & Shi, F. (2025). YOLOv11-MSE: A Multi-Scale Dilated Attention-Enhanced Lightweight Network for Efficient Real-Time Underwater Target Detection. Journal of Marine Science and Engineering, 13(10). https://doi.org/10.3390/jmse13101843
