The average person with a networked computer can now understand why computers should have vision: to search the world's collections of digital video and images and “retrieve a picture of ...” Computer vision for intelligent browsing, querying, and retrieval of imagery is needed now, yet traditional approaches to computer vision remain far from a general solution to the scene understanding problem. In this paper I discuss the need for a solution that combines high-level and low-level vision and works in concert with input from a human user. The solution is based on 1) learning from the user what is important visually, and 2) learning associations between text descriptions and visual data. I describe some recent results in these areas and give an overview of key challenges for future research in computer vision for digital libraries.
CITATION STYLE
Picard, R. W. (1996). Digital libraries: Meeting place for high-level and low-level vision. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1035, pp. 1–12). Springer Verlag. https://doi.org/10.1007/3-540-60793-5_57