Cloud image segmentation plays an important role in ground-based cloud observation. Recently, most existing methods for ground-based cloud image segmentation learn feature representations using the convolutional neural network (CNN), which results in the loss of global information because of the limited receptive field size of the filters in the CNN. In this article, we propose a novel deep model named TransCloudSeg, which makes full use of the advantages of the CNN and transformer to extract detailed information and global contextual information for ground-based cloud image segmentation. Specifically, TransCloudSeg hybridizes the CNN and transformer as the encoders to obtain different features. To recover and fuse the feature maps from the encoders, we design the CNN decoder and the transformer decoder for TransCloudSeg. After obtaining two sets of feature maps from two different decoders, we propose the heterogeneous fusion module to effectively fuse the heterogeneous feature maps by applying the self-attention mechanism. We conduct a series of experiments on Tianjin Normal University large-scale cloud detection database and Tianjin Normal University cloud detection database, and the results show that our method achieves a better performance than other state-of-the-art methods, thus proving the effectiveness of the proposed TransCloudSeg.
CITATION STYLE
Liu, S., Zhang, J., Zhang, Z., Cao, X., & Durrani, T. S. (2022). TransCloudSeg: Ground-Based Cloud Image Segmentation With Transformer. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 6121–6132. https://doi.org/10.1109/JSTARS.2022.3194316
Mendeley helps you to discover research relevant for your work.