Abstract
Traditional approaches to information retrieval represent a user's query as a bag of query terms and a document as a bag of index terms, and compute a degree of similarity between the two based on their overlap, i.e. the number of terms they have in common. Our long-term approach to IR applications is based on pre-computing semantically based word-word similarities (work described elsewhere) and using these as part of the document-query similarity measure. A basic premise of our word-to-word similarity measure is that its input is the correct or intended word sense, but in information retrieval applications automatic and accurate word sense disambiguation remains an unsolved problem. In this paper we describe our first successful application of these ideas to an information retrieval task: the indexing and retrieval of captions describing the content of images. We have hand-captioned 2,714 images and, to circumvent for the time being the problems raised by word sense disambiguation, we manually disambiguated polysemous words in the captions. We have also built a collection of 60 queries and determined relevance assessments for each. Using this environment we ran experiments varying how the query-caption similarity measure used our pre-computed word-word semantic distances. Our experiments, reported in the paper, show significant improvement in this environment over more traditional approaches to information retrieval.
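A minimal sketch of how a query-caption similarity measure might incorporate pre-computed word-word similarities, rather than pure term overlap. This is an illustration only, not the paper's actual measure: the function and data names are hypothetical, and the toy similarity table stands in for the semantically based word-word similarities the authors pre-compute.

```python
# Hypothetical sketch (not the paper's method): score a caption
# against a query by taking, for each query term, its best semantic
# match among the caption's (sense-disambiguated) index terms.

def caption_score(query_terms, caption_terms, sim):
    """Average best-match similarity of each query term vs. the caption.

    sim: dict mapping (word, word) pairs to a similarity in [0, 1];
    identical words score 1.0, unlisted pairs default to 0.0.
    """
    def pair_sim(a, b):
        if a == b:
            return 1.0
        # Look the pair up in either order.
        return max(sim.get((a, b), 0.0), sim.get((b, a), 0.0))

    if not query_terms:
        return 0.0
    return sum(
        max((pair_sim(q, c) for c in caption_terms), default=0.0)
        for q in query_terms
    ) / len(query_terms)

# Toy similarity table: "dog" and "hound" are semantically close.
sim = {("dog", "hound"): 0.8}
score = caption_score(["dog", "beach"], ["hound", "sand", "beach"], sim)
# "dog" best-matches "hound" (0.8), "beach" matches exactly (1.0),
# so the score is (0.8 + 1.0) / 2 = 0.9.
```

Under a strict term-overlap measure the query term "dog" would contribute nothing here; the semantic-distance variant credits the near-synonym "hound", which is the kind of improvement the paper's experiments quantify.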
Citation
Smeaton, A. F., & Quigley, I. (1996). Experiments on using semantic distances between words in image caption retrieval. In SIGIR Forum (ACM Special Interest Group on Information Retrieval) (pp. 174–180). https://doi.org/10.1145/243199.243261