This paper presents a knowledge-driven joint inference approach to designing and implementing extensible computational models for perceiving systems. These models can integrate different sources of information both horizontally (multi-modal and temporal fusion) and vertically (bottom–up, top–down) by incorporating prior hierarchical knowledge expressed as an extensible ontology. Two implementations of this approach are presented. The first is a content-based image retrieval system that allows users to search image databases using an ontological query language. Queries are parsed using a probabilistic grammar and Bayesian networks that map high-level concepts onto low-level image descriptors, thereby bridging the ‘semantic gap’ between users and the retrieval system. The second application extends the notion of ontological languages to video event detection. It is shown how effective high-level state and event recognition mechanisms can be learned from a set of annotated training sequences by incorporating the syntactic and semantic constraints represented by an ontology.
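To make the concept-to-descriptor mapping concrete, the following is a minimal sketch of the kind of Bayesian inference the abstract describes: a two-node network linking a high-level concept to a low-level image descriptor. The concept labels ("grass", "sky"), the descriptor ("dominant colour"), and all probability values here are illustrative assumptions, not figures from the paper.

```python
# Hypothetical prior over high-level concepts.
prior = {"grass": 0.3, "sky": 0.7}

# Hypothetical likelihoods P(colour | concept) linking each concept
# to an observed low-level descriptor (dominant colour of a region).
likelihood = {
    "grass": {"green": 0.8, "blue": 0.2},
    "sky":   {"green": 0.1, "blue": 0.9},
}

def posterior(observed_colour):
    """Posterior over concepts given one observed descriptor (Bayes' rule)."""
    joint = {c: prior[c] * likelihood[c][observed_colour] for c in prior}
    z = sum(joint.values())
    return {c: p / z for c, p in joint.items()}

print(posterior("green"))  # "grass" dominates for a green region
```

In the paper's systems, networks of this form sit between the parsed ontological query and the image descriptors, so a query term such as "grass" can be scored against regions via their low-level features rather than requiring an exact label match.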