Leveraging Tacit Information Embedded in CNN Layers for Visual Tracking

Abstract

Different layers in a CNN provide not only different levels of abstraction for describing the objects in the input but also encode various implicit information about them. The activation patterns of different features carry valuable information about the stream of incoming images: spatial relations, temporal patterns, and the co-occurrence of spatial and spatiotemporal (ST) features. Studies in the visual tracking literature have so far utilized only a single CNN layer, a pre-fixed combination of layers, or an ensemble of trackers built upon individual layers. In this study, we employ an adaptive combination of several CNN layers in a single DCF tracker to address variations in target appearance, and we propose the use of style statistics on both the spatial and temporal properties of the target, extracted directly from CNN layers, for visual tracking. Experiments show that exploiting this additional implicit information in CNNs significantly improves the performance of the tracker. The results also demonstrate the effectiveness of style-similarity and activation-consistency regularization in improving the tracker's localization and scale accuracy.
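
The abstract refers to "style statistics" extracted directly from CNN layers and to an adaptive combination of several layers inside a single DCF tracker. As one plausible reading of those terms (not the authors' published implementation), the sketch below computes Gram-matrix style statistics from a few intermediate layers of a VGG-16 backbone and combines the per-layer style distances with adaptive weights; the backbone, tapped layers, and weighting scheme are illustrative assumptions.

```python
# Minimal sketch, assuming "style statistics" means Gram matrices of layer activations
# and "adaptive combination" means per-layer weights on the resulting style distances.
# Backbone and layer indices are illustrative, not taken from the paper.
import torch
import torchvision.models as models

backbone = models.vgg16(weights=None).features.eval()  # untrained weights, for illustration
tap_layers = {3: "relu1_2", 8: "relu2_2", 15: "relu3_3"}  # indices into VGG-16 `features`

def extract_activations(image):
    """Run the backbone and collect activations at the tapped layers."""
    acts, x = {}, image
    for idx, layer in enumerate(backbone):
        x = layer(x)
        if idx in tap_layers:
            acts[tap_layers[idx]] = x
    return acts

def gram_matrix(feat):
    """Style statistic: channel-to-channel correlation of a feature map."""
    b, c, h, w = feat.shape
    f = feat.reshape(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_similarity(acts_a, acts_b, weights):
    """Weighted (adaptive) combination of per-layer style distances; higher is more similar."""
    total = 0.0
    for name, w in weights.items():
        total = total + w * torch.norm(gram_matrix(acts_a[name]) - gram_matrix(acts_b[name]))
    return -total

# Example usage with random tensors standing in for the target template and a candidate patch.
template = torch.rand(1, 3, 224, 224)
candidate = torch.rand(1, 3, 224, 224)
layer_weights = {"relu1_2": 0.2, "relu2_2": 0.3, "relu3_3": 0.5}  # would be adapted online
with torch.no_grad():
    score = style_similarity(extract_activations(template),
                             extract_activations(candidate),
                             layer_weights)
```

In a tracker, a score of this kind could serve as an appearance-consistency term alongside the DCF response, with the layer weights updated online as the target appearance changes; how the paper actually integrates these terms is described in the full text.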

Citation (APA)

Meshgi, K., Mirzaei, M. S., & Oba, S. (2021). Leveraging Tacit Information Embedded in CNN Layers for Visual Tracking. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12623 LNCS, pp. 521–538). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-69532-3_32
