AiATrack: Attention in Attention for Transformer Visual Tracking

Shenyuan Gao; Chunluan Zhou; Chao Ma; Xinggang Wang; Junsong Yuan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 13682 LNCS, pp. 146–164 (2022)

DOI: 10.1007/978-3-031-20047-2_9

99 Citations

50 Readers

Abstract

Transformer trackers have achieved impressive advancements recently, where the attention mechanism plays an important role. However, the independent correlation computation in the attention mechanism could result in noisy and ambiguous attention weights, which inhibits further performance improvement. To address this issue, we propose an attention in attention (AiA) module, which enhances appropriate correlations and suppresses erroneous ones by seeking consensus among all correlation vectors. Our AiA module can be readily applied to both self-attention blocks and cross-attention blocks to facilitate feature aggregation and information propagation for visual tracking. Moreover, we propose a streamlined Transformer tracking framework, dubbed AiATrack, by introducing efficient feature reuse and target-background embeddings to make full use of temporal references. Experiments show that our tracker achieves state-of-the-art performance on six tracking benchmarks while running at a real-time speed. Code and models are publicly available at https://github.com/Little-Podi/AiATrack.
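
The idea of refining attention weights by "seeking consensus among all correlation vectors" can be illustrated with a small sketch. The code below is not the authors' implementation (see the linked repository for that); it is a minimal pure-Python illustration built only on the abstract's description. It makes several assumptions: each row of the query-key correlation map is treated as one correlation vector, the inner attention is a plain scaled dot-product over those vectors with no learned projections, normalization, or positional encodings, and its output is added back to the correlation map as a residual before the softmax. All function names (`correlation_map`, `inner_attention`, `aia_attention`) are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Multiply an n x k matrix by a k x m matrix (lists of lists)."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def correlation_map(queries, keys):
    """Scaled dot-product correlations; row i is the correlation vector of query i."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    return [[scale * v for v in row] for row in matmul(queries, transpose(keys))]


def inner_attention(corr):
    """Assumed inner attention: correlation vectors attend to one another.

    Each output row is a weighted average of all correlation vectors, so
    vectors that agree with many others reinforce each other.
    """
    scale = 1.0 / math.sqrt(len(corr[0]))
    similarity = [[scale * v for v in row] for row in matmul(corr, transpose(corr))]
    weights = [softmax(row) for row in similarity]
    return matmul(weights, corr)


def aia_attention(queries, keys, values):
    """Outer attention whose correlation map is refined by the inner attention."""
    corr = correlation_map(queries, keys)
    residual = inner_attention(corr)
    # Assumption: the consensus signal is added as a residual before softmax.
    refined = [[c + r for c, r in zip(c_row, r_row)]
               for c_row, r_row in zip(corr, residual)]
    weights = [softmax(row) for row in refined]
    return matmul(weights, values), weights
```

Calling `aia_attention(queries, keys, values)` with lists of equal-length feature vectors returns the aggregated features together with the refined attention weights. Because the sketch has no learned parameters, it shows only the data flow, not the accuracy gains reported in the paper.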

Cite (APA)

Gao, S., Zhou, C., Ma, C., Wang, X., & Yuan, J. (2022). AiATrack: Attention in Attention for Transformer Visual Tracking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13682 LNCS, pp. 146–164). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-20047-2_9
