Transformers have shown excellent performance in the remote sensing field with their long-range modeling capabilities. Moving object detection and tracking in remote sensing video (RSV) play indispensable roles in military activities and urban monitoring. However, the use of transformers in these areas is still at the exploratory stage. In this survey, we comprehensively summarize the research prospects of transformers in RSV moving object detection and tracking. We first analyze the core designs of remote sensing transformers and advanced transformers, which mainly include the evolution of attention mechanisms for specific tasks, the design of input mappings for fitting ability, diverse feature representations, and model optimization. We then describe the architectural characteristics of RSV detection and tracking from two aspects. The first is moving object detection, covering motion-based traditional background subtraction and appearance-based deep learning models. The second is object tracking, covering single and multiple targets. The main research difficulties include blurred foregrounds in RSV data, irregular object movement in traditional background subtraction, and severe object occlusion in object tracking. Following that, we discuss the potential significance of transformers with respect to several thorny problems in RSV. Finally, we summarize ten open challenges of transformers in RSV, which may serve as a reference for future research.
Jiao, L., Zhang, X., Liu, X., Liu, F., Yang, S., Ma, W., … Zhang, J. (2023). Transformer Meets Remote Sensing Video Detection and Tracking: A Comprehensive Survey. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 16. https://doi.org/10.1109/JSTARS.2023.3289293