In deep learning-based hyperspectral remote sensing image classification, training and test sets are typically drawn from the same scene by random sampling. However, this strategy produces strong spatial autocorrelation between training samples and the neighboring test samples, and some unlabeled test-set data directly participate in training the network. This leaked information makes the evaluation overly optimistic: models trained under these conditions tend to overfit a single dataset, which limits their practical applicability. This paper analyzes the causes and effects of information leakage and surveys the methods existing models use to mitigate it. First, the main issues in this area are stated, with the information leakage problem treated in detail. Second, algorithms and related models used to mitigate information leakage are categorized, including reducing the number of training samples, spatially disjoint sampling strategies, few-shot learning, and unsupervised learning; these methods are organized by whether they act in the sampling phase or the feature-extraction phase. Finally, experiments with several representative hyperspectral image classification models are conducted on common datasets, and their effectiveness in mitigating information leakage is analyzed.
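The spatial-autocorrelation problem described above can be illustrated with a minimal sketch (not from the paper; the grid size, training fraction, and split geometry are illustrative assumptions): under a random per-pixel split, a large fraction of test pixels sit directly adjacent to training pixels, whereas a spatially disjoint split confines such contact to a single boundary.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W = 50, 50  # hypothetical label-map size (assumption)

# Random per-pixel split: ~10% of pixels used for training
train_random = rng.random((H, W)) < 0.10

# Spatially disjoint split: train on the leftmost 10% of columns
train_disjoint = np.zeros((H, W), dtype=bool)
train_disjoint[:, : W // 10] = True

def adjacent_to_train(train_mask):
    """Fraction of test pixels with a 4-neighbor in the training set."""
    pad = np.pad(train_mask, 1)
    near = pad[:-2, 1:-1] | pad[2:, 1:-1] | pad[1:-1, :-2] | pad[1:-1, 2:]
    test = ~train_mask
    return (near & test).sum() / test.sum()

print(f"random split:   {adjacent_to_train(train_random):.2f}")
print(f"disjoint split: {adjacent_to_train(train_disjoint):.2f}")
```

With patch-based classifiers, every such adjacency lets a test pixel's spectral-spatial neighborhood overlap the training data, which is exactly the leakage the surveyed disjoint-sampling strategies aim to remove.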
Citation:
Feng, H., Wang, Y., Li, Z., Zhang, N., Zhang, Y., & Gao, Y. (2023, August 1). Information Leakage in Deep Learning-Based Hyperspectral Image Classification: A Survey. Remote Sensing. Multidisciplinary Digital Publishing Institute (MDPI). https://doi.org/10.3390/rs15153793