Abstract
Single-frame infrared small target detection remains challenging because of complex backgrounds and the lack of distinct structural characteristics in small targets. Recently, convolutional neural networks (CNNs) have been introduced to infrared small target detection and are now widely used because of their strong performance. However, existing CNN-based methods mainly focus on local spatial features and ignore the long-range contextual dependencies between small targets and backgrounds. To capture global context-aware information, we propose a fusion network architecture of transformer and CNN (FTC-Net), which consists of two branches. The CNN-based branch uses a U-Net with skip connections to obtain low-level local details of small targets. The transformer-based branch applies hierarchical self-attention mechanisms to learn long-range contextual dependencies. Specifically, the transformer branch can suppress background interference and enhance target features. To obtain both local and global feature representations, we design a feature fusion module that combines the features of the two branches. We conduct ablation and comparative experiments on the publicly available SIRST dataset. Experimental results show that the transformer-based branch is effective and suggest that the proposed FTC-Net outperforms other state-of-the-art methods.
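The two-branch idea can be illustrated with a toy one-dimensional sketch. This is not the authors' implementation: the function names `local_branch`, `global_branch`, and `fuse` are hypothetical, the 3-tap kernel stands in for a U-Net, the self-attention uses identity projections on scalar tokens, and fusing by per-position concatenation is an assumption, since the abstract says only that a feature fusion module combines the branches.

```python
import math


def local_branch(x, kernel=(0.25, 0.5, 0.25)):
    """Toy stand-in for the CNN branch: a 3-tap convolution with zero padding.

    Each output depends only on a position and its two immediate neighbours.
    """
    padded = [0.0] + list(x) + [0.0]
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(x))
    ]


def global_branch(x):
    """Toy stand-in for the transformer branch: single-head self-attention.

    Queries, keys, and values are the raw scalars (identity projections,
    an assumption), so every output is a softmax-weighted mix of ALL positions.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, x)) / total)
    return out


def fuse(local_feats, global_feats):
    """Assumed fusion: concatenate the two features at each position."""
    return [(l, g) for l, g in zip(local_feats, global_feats)]


# A dim background with one bright "small target" at index 3.
signal = [0.1, 0.0, 0.2, 3.0, 0.1, 0.0]
fused = fuse(local_branch(signal), global_branch(signal))
for position, (local_feat, global_feat) in enumerate(fused):
    print(position, round(local_feat, 3), round(global_feat, 3))
```

The local feature at each position changes only when its neighbours change, while the global feature depends on every position in the signal, which is the distinction the abstract draws between the two branches.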
Cite
Qi, M., Liu, L., Zhuang, S., Liu, Y., Li, K., Yang, Y., & Li, X. (2022). FTC-Net: Fusion of Transformer and CNN Features for Infrared Small Target Detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 8613–8623. https://doi.org/10.1109/JSTARS.2022.3210707