Abstract
Earth observation data hold great potential to enrich our knowledge about our planet. An important step in many Earth observation tasks is semantic segmentation. Generally, a large number of pixelwise labeled images is required to train deep models for supervised semantic segmentation. However, strong inter-sensor and geographic variations limit the availability of annotated training data in Earth observation. In practice, many Earth observation tasks must work with only the target scene, without assuming the availability of any additional scene, labeled or unlabeled. With these constraints in mind, we propose a semantic segmentation method that learns to segment a single scene without using any annotation. Earth observation scenes are generally larger than those encountered in typical computer vision datasets. Exploiting this, the proposed method samples smaller unlabeled patches from the scene. For each patch, an alternate view is generated by simple transformations, e.g., the addition of noise. Both views are then processed by a two-stream network whose weights are iteratively refined using deep clustering, spatial consistency, and contrastive learning in the pixel space. The proposed model automatically separates the major classes present in the scene and produces a segmentation map. Extensive experiments on four Earth observation datasets collected by different sensors demonstrate the effectiveness of the proposed method. The implementation is available at https://gitlab.lrz.de/ai4eo/cd/-/tree/main/unsupContrastiveSemanticSeg.
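To make the pixel-space contrastive step concrete, below is a minimal PyTorch sketch. The names make_views, pixel_contrastive_loss, the noise level, and the pixel subsampling are illustrative assumptions, not the authors' exact formulation, which additionally combines deep clustering and spatial-consistency terms; for the full method, see the linked repository.

import torch
import torch.nn.functional as F

def make_views(patch, noise_std=0.05):
    # View 1: the unlabeled patch itself; view 2: an alternate view generated
    # by a simple transformation (here, additive Gaussian noise), as described
    # in the abstract.
    return patch, patch + noise_std * torch.randn_like(patch)

def pixel_contrastive_loss(feat1, feat2, temperature=0.1, n_samples=1024):
    # feat1, feat2: (B, C, H, W) pixel embeddings from the two streams.
    # Co-located pixels across the two views form positive pairs; all other
    # sampled pixels act as negatives (an InfoNCE-style loss).
    B, C, H, W = feat1.shape
    z1 = F.normalize(feat1.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    z2 = F.normalize(feat2.permute(0, 2, 3, 1).reshape(-1, C), dim=1)
    # Subsample pixel locations so the (n x n) similarity matrix stays tractable.
    idx = torch.randperm(z1.size(0), device=z1.device)[:n_samples]
    z1, z2 = z1[idx], z2[idx]
    logits = z1 @ z2.t() / temperature               # pairwise similarities
    targets = torch.arange(z1.size(0), device=z1.device)  # positive = same location
    return F.cross_entropy(logits, targets)

Usage with any fully convolutional encoder net mapping (B, bands, H, W) to (B, C, H, W) pixel embeddings (a hypothetical stand-in for the paper's two-stream network): v1, v2 = make_views(patch_batch); loss = pixel_contrastive_loss(net(v1), net(v2)).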
Citation
Saha, S., Shahzad, M., Mou, L., Song, Q., & Zhu, X. X. (2022). Unsupervised Single-Scene Semantic Segmentation for Earth Observation. IEEE Transactions on Geoscience and Remote Sensing, 60. https://doi.org/10.1109/TGRS.2022.3174651