Online selection of discriminative tracking features
- ISSN: 0162-8828
- DOI: 10.1109/TPAMI.2005.205
- PubMed: 16237997
Abstract
This paper presents an online feature selection mechanism for evaluating multiple features while tracking and adjusting the set of features used to improve tracking performance. Our hypothesis is that the features that best discriminate between object and background are also best for tracking the object. Given a set of seed features, we compute log likelihood ratios of class conditional sample densities from object and background to form a new set of candidate features tailored to the local object/background discrimination task. The two-class variance ratio is used to rank these new features according to how well they separate sample distributions of object and background pixels. This feature evaluation mechanism is embedded in a mean-shift tracking system that adaptively selects the top-ranked discriminative features for tracking. Examples are presented that demonstrate how this method adapts to changing appearances of both tracked object and scene background. We note susceptibility of the variance ratio feature selection method to distraction by spatially correlated background clutter and develop an additional approach that seeks to minimize the likelihood of distraction.
Robert T. Collins, Senior Member, IEEE, Yanxi Liu, Senior Member, IEEE, and Marius Leordeanu
Index Terms—Computer vision, tracking, time-varying imagery, feature creation, feature evaluation and selection.
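The feature-ranking step summarized in the abstract can be illustrated with a minimal Python sketch. It histograms object and background samples of each candidate feature, forms a per-bin log likelihood ratio, and scores the feature with a two-class variance ratio. The abstract does not give the exact formulas, so the following are assumptions made for illustration: the bin count, the small floor `delta` that avoids taking the log of zero, and the specific form of the ratio (variance of the log likelihood ratio under the pooled distribution, divided by the sum of its variances under each class). All function names are hypothetical.

```python
import numpy as np


def class_histograms(obj_vals, bg_vals, n_bins=32, value_range=(0.0, 256.0)):
    """Normalized histograms p (object) and q (background) of one feature."""
    p, _ = np.histogram(obj_vals, bins=n_bins, range=value_range)
    q, _ = np.histogram(bg_vals, bins=n_bins, range=value_range)
    return p / p.sum(), q / q.sum()


def log_likelihood_ratio(p, q, delta=1e-3):
    """Per-bin log likelihood ratio; delta (an assumed floor) avoids log(0)."""
    return np.log(np.maximum(p, delta) / np.maximum(q, delta))


def weighted_variance(L, a):
    """Variance of L under the discrete distribution a: E[L^2] - (E[L])^2."""
    mean = np.sum(a * L)
    return np.sum(a * L ** 2) - mean ** 2


def variance_ratio(L, p, q, eps=1e-12):
    """Assumed form: pooled variance of L over the summed per-class variances."""
    between = weighted_variance(L, 0.5 * (p + q))
    within = weighted_variance(L, p) + weighted_variance(L, q)
    return between / max(within, eps)


def rank_features(samples):
    """samples maps a feature name to (object_values, background_values).

    Returns (name, score) pairs sorted from most to least discriminative.
    """
    scores = {}
    for name, (obj_vals, bg_vals) in samples.items():
        p, q = class_histograms(obj_vals, bg_vals)
        L = log_likelihood_ratio(p, q)
        scores[name] = variance_ratio(L, p, q)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Synthetic values for two hypothetical features.
    samples = {
        "separating": (np.linspace(180, 220, 200), np.linspace(30, 70, 200)),
        "identical": (np.linspace(100, 140, 200), np.linspace(100, 140, 200)),
    }
    for name, score in rank_features(samples):
        print(name, score)
```

Under these assumptions, a feature whose object and background values fall in different histogram bins scores high, while a feature with identical class distributions scores zero. In a tracker, the top-ranked features would then supply the per-pixel values that the mean-shift step follows.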
1 INTRODUCTION
Two decades of vision research have yielded an arsenal of powerful algorithms for object tracking. Multiple moving
objects can be effectively tracked in real-time from
stationary cameras using frame differencing or adaptive
background subtraction combined with simple data association
techniques [1], [6], [26]. These detect-then-track
approaches can be generalized to situations where apparent
camera motion is easily stabilized, including purely rotating
and zooming cameras and aerial views where scene
structure is approximately planar [14]. Modern appearance-based
methods use gradient descent to incrementally
follow a reference object model through video without prior
knowledge of scene structure or camera motion. This
includes the use of flexible template models [8], [21] and
kernel-based methods that track nonrigid objects using
viewpoint-insensitive histograms [7], [10]. Kalman filter
extensions achieve more robust tracking of maneuvering
objects by introducing statistical models of object and
camera motion [3], [16]. Tracking through occlusion and
clutter is achieved by reasoning over a state-space of
multiple hypotheses [15], [23], [24].
Our experience with a variety of tracking methods can be
summarized simply: Tracking success or failure depends
primarily on how distinguishable an object is from its
surroundings. If the object is very distinctive, we can use a
simple tracker to follow it. If the object has low contrast or is
camouflaged, we will obtain robust tracking only by
imposing prior knowledge about scene structure or
expected motion, thus buying tracking success at the price
of reduced generality.
The degree to which a tracker can discriminate object
and background is directly related to the image features
used. Surprisingly, most tracking applications are conducted
using a fixed set of features, determined a priori.
Sometimes, preliminary experiments are run to determine
which fixed features to use—a good example is work on
head tracking using skin color, where many papers evaluate
different color spaces to find one in which pixel values for
skin cluster most tightly, e.g., [30]. However, these
approaches ignore the fact that it is the ability to distinguish
between object and background that is most important and
the background cannot always be specified in advance.
Furthermore, both foreground and background appearance
will change as the target object moves from place to place,
so tracking features also need to adapt. Fig. 1 illustrates this
observation with low-contrast imagery of a car traveling
through patches of sunlight and shadow. The best feature
for tracking the car through sunlight performs poorly in
shadow and vice versa.
A key issue addressed in this work is online, adaptive
selection of appropriate features for tracking. Target
tracking is cast as a local discrimination problem with
two classes: foreground and background. Our insight is that
the features that best distinguish between object and
background are the best features for tracking. This point
of view opens up a wide range of pattern recognition
feature selection techniques that can be adapted for use in
tracking. An interesting characteristic of target tracking is
that foreground and background appearances are constantly
changing, albeit gradually. Naturally, when class
appearance varies, the most discriminating set of features
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 27, NO. 10, OCTOBER 2005 1631
- R.T. Collins is with the Computer Science Engineering Department, The
Pennsylvania State University, University Park, PA 16802.
E-mail: rcollins@cse.psu.edu.
- Y. Liu is with The Robotics Institute and the Center for Automated
Learning and Discovery, Carnegie Mellon University, 5000 Forbes Ave.,
Pittsburgh, PA 15213. E-mail: yanxi@cs.cmu.edu.
- M. Leordeanu is with The Robotics Institute, Carnegie Mellon University,
5000 Forbes Ave., Pittsburgh, PA 15213.
E-mail: mleordea@andrew.cmu.edu.
Manuscript received 9 July 2004; revised 14 Dec. 2004; accepted 19 Jan. 2005;
published online 11 Aug. 2005.
Recommended for acceptance by C. Taylor.
For information on obtaining reprints of this article, please send e-mail to:
tpami@computer.org, and reference IEEECS Log Number TPAMI-0343-0704.
0162-8828/05/$20.00 © 2005 IEEE. Published by the IEEE Computer Society.