
Object recognition from local scale-invariant features

by D G Lowe
Proceedings of the Seventh IEEE International Conference on Computer Vision (1999)

Abstract

An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest-neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low-residual least-squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered, partially occluded images with a computation time of under 2 seconds.


Available from ieeexplore.ieee.org

Object Recognition from Local Scale-Invariant Features

David G. Lowe
Computer Science Department
University of British Columbia
Vancouver, B.C., V6T 1Z4, Canada
lowe@cs.ubc.ca

Proc. of the International Conference on Computer Vision, Corfu (Sept. 1999)

Abstract

An object recognition system has been developed that uses a new class of local image features. The features are invariant to image scaling, translation, and rotation, and partially invariant to illumination changes and affine or 3D projection. These features share similar properties with neurons in inferior temporal cortex that are used for object recognition in primate vision. Features are efficiently detected through a staged filtering approach that identifies stable points in scale space. Image keys are created that allow for local geometric deformations by representing blurred image gradients in multiple orientation planes and at multiple scales. The keys are used as input to a nearest-neighbor indexing method that identifies candidate object matches. Final verification of each match is achieved by finding a low-residual least-squares solution for the unknown model parameters. Experimental results show that robust object recognition can be achieved in cluttered partially-occluded images with a computation time of under 2 seconds.

1. Introduction

Object recognition in cluttered real-world scenes requires local image features that are unaffected by nearby clutter or partial occlusion. The features must be at least partially invariant to illumination, 3D projective transforms, and common object variations. On the other hand, the features must also be sufficiently distinctive to identify specific objects among many alternatives. The difficulty of the object recognition problem is due in large part to the lack of success in finding such image features.
However, recent research on the use of dense local features (e.g., Schmid & Mohr [19]) has shown that efficient recognition can often be achieved by using local image descriptors sampled at a large number of repeatable locations.

This paper presents a new method for image feature generation called the Scale Invariant Feature Transform (SIFT). This approach transforms an image into a large collection of local feature vectors, each of which is invariant to image translation, scaling, and rotation, and partially invariant to illumination changes and affine or 3D projection. Previous approaches to local feature generation lacked invariance to scale and were more sensitive to projective distortion and illumination change. The SIFT features share a number of properties in common with the responses of neurons in inferior temporal (IT) cortex in primate vision. This paper also describes improved approaches to indexing and model verification.

The scale-invariant features are efficiently identified by using a staged filtering approach. The first stage identifies key locations in scale space by looking for locations that are maxima or minima of a difference-of-Gaussian function. Each point is used to generate a feature vector that describes the local image region sampled relative to its scale-space coordinate frame. The features achieve partial invariance to local variations, such as affine or 3D projections, by blurring image gradient locations. This approach is based on a model of the behavior of complex cells in the cerebral cortex of mammalian vision. The resulting feature vectors are called SIFT keys. In the current implementation, each image generates on the order of 1000 SIFT keys, a process that requires less than 1 second of computation time.

The SIFT keys derived from an image are used in a nearest-neighbour approach to indexing to identify candidate object models.
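The first filtering stage described above, finding maxima or minima of a difference-of-Gaussian function, can be illustrated with a minimal sketch. This is not the paper's implementation: it computes a single difference-of-Gaussian level rather than a full scale-space pyramid, and the function names (`gaussian_kernel`, `blur`, `dog_extrema`), the values `sigma=1.0` and `ratio=sqrt(2)`, and the contrast `threshold` are all assumptions chosen for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(image, sigma):
    """Separable Gaussian blur with clamped (edge-replicating) borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass, then vertical pass over the horizontal result.
    rows = [[sum(k * image[y][clamp(x + j - radius, width)]
                 for j, k in enumerate(kernel))
             for x in range(width)] for y in range(height)]
    return [[sum(k * rows[clamp(y + j - radius, height)][x]
                 for j, k in enumerate(kernel))
             for x in range(width)] for y in range(height)]

def dog_extrema(image, sigma=1.0, ratio=math.sqrt(2), threshold=1e-3):
    """Return (y, x, value) for strict 3x3 extrema of one DoG level.

    Assumption: only one pair of blur levels is compared; a full
    scale-space search would also compare against adjacent scales.
    """
    fine = blur(image, sigma)
    coarse = blur(image, sigma * ratio)
    height, width = len(image), len(image[0])
    dog = [[fine[y][x] - coarse[y][x] for x in range(width)]
           for y in range(height)]
    keys = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            value = dog[y][x]
            if abs(value) < threshold:  # skip low-contrast responses
                continue
            neighbors = [dog[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dy, dx) != (0, 0)]
            if value > max(neighbors) or value < min(neighbors):
                keys.append((y, x, value))
    return keys

# Usage: a single bright point on a dark background.
image = [[0.0] * 15 for _ in range(15)]
image[7][7] = 1.0
keys = dog_extrema(image)
```

In this toy image the bright point produces a positive difference-of-Gaussian peak at its own location, while a perfectly uniform image produces no key locations at all.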
Collections of keys that agree on a potential model pose are first identified through a Hough transform hash table, and then through a least-squares fit to a final estimate of model parameters. When at least 3 keys agree on the model parameters with low residual, there is strong evidence for the presence of the object. Since there may be dozens of SIFT keys in the image of a typical object, it is possible to have substantial levels of occlusion in the image and yet retain high levels of reliability.

The current object models are represented as 2D locations of SIFT keys that can undergo affine projection. Sufficient variation in feature location is allowed to recognize perspective projection of planar shapes at up to a 60 degree rotation away from the camera or to allow up to a 20 degree rotation of a 3D object.
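The verification step, a least-squares fit of affine model parameters to three or more agreeing keys, can be sketched as follows. The excerpt does not give the formulation, so this is an assumption: it solves the normal equations for u = m1*x + m2*y + tx and v = m3*x + m4*y + ty. The Hough transform stage is omitted, and the names `solve_linear` and `fit_affine` are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration (e.g. collinear points)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_affine(model_pts, image_pts):
    """Least-squares affine map from model (x, y) to image (u, v).

    Returns ([m1, m2, tx], [m3, m4, ty]) and the RMS residual.
    """
    if len(model_pts) != len(image_pts) or len(model_pts) < 3:
        raise ValueError("need at least 3 point correspondences")
    rows = [(x, y, 1.0) for x, y in model_pts]
    # Normal equations: (A^T A) p = A^T b, solved once per image axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    params = []
    for axis in range(2):
        atb = [sum(r[i] * p[axis] for r, p in zip(rows, image_pts))
               for i in range(3)]
        params.append(solve_linear(ata, atb))
    squared = 0.0
    for (x, y), (u, v) in zip(model_pts, image_pts):
        pu = params[0][0] * x + params[0][1] * y + params[0][2]
        pv = params[1][0] * x + params[1][1] * y + params[1][2]
        squared += (pu - u) ** 2 + (pv - v) ** 2
    return params, math.sqrt(squared / len(model_pts))

# Usage: four matches generated by u = 2x - y + 3, v = x + 0.5y - 1.
model = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
image = [(3.0, -1.0), (5.0, 0.0), (2.0, -0.5), (4.0, 0.5)]
params, rms = fit_affine(model, image)
```

A low RMS residual over at least three matches corresponds to the agreement criterion in the text. The threshold for what counts as "low" is not stated in this excerpt and would need to be chosen separately.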

