Because of the semantic gap, the automatic interpretation of images is an intricate task. In this chapter, we have presented a bottom-up approach for the segmentation and interpretation of outdoor scenery images. We established a link between the proposed low-level, biologically inspired features and a set of predefined semantic concepts by applying a Self-Organizing Map (SOM) for classification. Our method consists of two stages. In the first stage, color opponent values and textures are extracted from the image's pixels. Color opponent values yield classification results comparable to those obtained with colors from the HSI space. Since the transformation of RGB into color opponent values is computationally less expensive than the transformation into HSI, the former are preferred over the latter. The texture features consist of enhanced grating cell features and smoothed Gabor responses, and correspond to the outputs of cells found in the primary visual cortex of primates and humans. Analogously to the processing principles of the auditory and visual cortex, Self-Organizing Maps are used for the unsupervised segmentation and labeling of textured images. Even with small maps, high-precision image segmentations can be obtained, both on gray-scale and on natural color textures. Adding color information increased the precision of the segmentation results by 5% on average, to a total of 91% of the pixels. In the second stage, the same features are used to train a Self-Organizing Map with textures belonging to one of five predefined classes: (i) grass, (ii) bricks, (iii) branches, (iv) water, and (v) sky. This map is then used to label the previously obtained image segments. Experiments conducted on images randomly collected from the World Wide Web achieved a precision of 89%. These results indicate that biologically inspired features are very useful for scene interpretation and categorization.
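
The second stage summarized above can be illustrated with a minimal sketch. It is not the implementation used in the chapter: the opponent-color formula is one commonly used variant and is an assumption here, the texture features (grating cells, Gabor responses) are left out, and the majority-vote rules for labeling map nodes and image segments are assumed, since the text does not specify them. All function names and parameters (`train_som`, `label_nodes`, `label_segment`, `lr0`, `epochs`) are hypothetical.

```python
import math
import random
from collections import Counter


def rgb_to_opponent(r, g, b):
    """One common opponent-color transform (an assumption; the chapter's
    exact formula may differ). Returns (red-green, yellow-blue, intensity)."""
    return ((r - g) / math.sqrt(2),
            (r + g - 2 * b) / math.sqrt(6),
            (r + g + b) / math.sqrt(3))


def bmu(weights, x):
    """Index of the best-matching unit: the node whose weight vector has the
    smallest squared Euclidean distance to x."""
    return min(range(len(weights)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)))


def train_som(samples, rows, cols, epochs=30, lr0=0.5, seed=0):
    """Train a rows x cols SOM with a Gaussian neighborhood; learning rate
    and neighborhood width shrink linearly over the epochs."""
    rng = random.Random(seed)
    dim = len(samples[0])
    # Initialize every node from a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            br, bc = divmod(bmu(weights, x), cols)
            for i, w in enumerate(weights):
                r, c = divmod(i, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def label_nodes(weights, samples, labels):
    """Give each node the majority class of the training samples mapped to
    it (assumed rule); nodes that attract no sample stay unlabeled (None)."""
    votes = [Counter() for _ in weights]
    for x, y in zip(samples, labels):
        votes[bmu(weights, x)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def label_segment(weights, node_labels, segment_features):
    """Label a segment by majority vote over the node labels of its pixels'
    best-matching units (assumed rule); unlabeled nodes are ignored."""
    votes = Counter(node_labels[bmu(weights, x)] for x in segment_features)
    votes.pop(None, None)
    return votes.most_common(1)[0][0] if votes else None
```

In use, each training pixel would be converted with `rgb_to_opponent` (and, in the full method, extended with texture features), the map trained with `train_som`, its nodes labeled with `label_nodes`, and each segment from the first stage passed to `label_segment`.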
We further believe that the classification can be improved by (i) a more accurate segmentation, (ii) a larger, more representative training set, and (iii) the introduction of high-level domain knowledge (i.e., a top-down approach). These aspects will be thoroughly investigated. When trying to recognize more concepts, we found that adding extra texture classes to the training set drastically increases the number of misclassifications. A solution to this problem might be the introduction of hierarchical or tree-structured Self-Organizing Maps (Koikkalainen & Oja, 1990). Nodes containing two or more classes are, in a subsequent stage, split into different clusters, which separates the related concepts. Furthermore, since the visual cortex contains different types of (complex) cells that are tuned to specific tasks (e.g., the detection of edges), we believe that these cells can also play an important role in image understanding.
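
The splitting idea can be sketched as follows. The sketch covers only the first step, finding the nodes that hold two or more classes and collecting their samples as training sets for child maps. It is not the algorithm of Koikkalainen & Oja, and the names `mixed_nodes`, `samples_for_child_maps`, and the `max_purity` threshold are hypothetical. The input `assignments` is assumed to hold, for each training sample, the index of its best-matching node.

```python
from collections import Counter, defaultdict


def mixed_nodes(assignments, labels, max_purity=0.9):
    """Return {node: Counter of class labels} for every node that holds more
    than one class and whose dominant class covers at most `max_purity` of
    the samples mapped to it (the threshold is an illustrative assumption)."""
    per_node = defaultdict(Counter)
    for node, label in zip(assignments, labels):
        per_node[node][label] += 1
    result = {}
    for node, counts in per_node.items():
        total = sum(counts.values())
        dominant = counts.most_common(1)[0][1]
        if len(counts) > 1 and dominant / total <= max_purity:
            result[node] = counts
    return result


def samples_for_child_maps(assignments, samples, labels, nodes):
    """Group the (sample, label) pairs of each mixed node; each group would
    serve as the training set of one child map in the next stage."""
    groups = {node: [] for node in nodes}
    for node, x, y in zip(assignments, samples, labels):
        if node in groups:
            groups[node].append((x, y))
    return groups
```

Each group returned by `samples_for_child_maps` could then be used to train a separate, smaller map, so that classes sharing a node in the parent map are separated one level down.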
CITATION STYLE
Martens, G., Lambert, P., & de Walle, R. V. (2010). Bridging the Semantic Gap using Human Vision System Inspired Features. In Self-Organizing Maps. InTech. https://doi.org/10.5772/9179