Efficient and effective Querying by Image Content
In the QBIC (Query By Image Content) project we are studying methods to query large on-line image databases using the images' content as the basis of the queries. Examples of the content we use include color, texture, shape, position, and dominant edges of image objects and regions. Potential applications include medical (``Give me other images that contain a tumor with a texture like this one''), photo-journalism (``Give me images that have blue at the top and red at the bottom''), and many others in art, fashion, cataloging, retailing, and industry. We describe a set of novel features and similarity measures allowing query by image content, together with the QBIC system we implemented. We demonstrate the effectiveness of our system with normalized precision and recall experiments on test databases containing over 1000 images and 1000 objects populated from commercially available photo clip art images, and of images of airplane silhouettes. We also present new methods for efficient processing of QBIC queries that consist of filtering and indexing steps. We specifically address two problems: (a) non Euclidean distance measures; and (b) the high dimensionality of feature vectors. For the first problem, we introduce a new theorem that makes efficient filtering possible by bounding the non-Euclidean, full cross-term quadratic distance expression with a simple Euclidean distance. For the second, we illustrate how orthogonal transforms, such as Karhunen Loeve, can help reduce the dimensionality of the search space. Our methods are general and allow some ``false hits'' but no false dismissals. The resulting QBIC system offers effective retrieval using image content, and for large image databases significant speedup over straightforward indexing alternatives. The system is implemented in X/Motif and C running on an RS/6000.