Over the last decades, many language model approaches have been proposed to relax the assumption of term independence in documents. This assumption leads to two well-known problems in information retrieval, namely polysemy and synonymy. In this paper, we propose a new language model based on concepts, to address the polysemy issue, and on semantic dependencies, to handle the synonymy problem. Our aim is to relax the independence constraint by representing documents and queries by their concepts instead of single words. We consider that a concept may be a single word, a collocation that is frequent in the corpus, or an ontology entry. In addition, semantic dependencies between query and document concepts are incorporated into our model through a semantic smoothing technique. This allows retrieving not only documents containing the same words as the query but also documents dealing with the same concepts. Experiments carried out on TREC collections show that our model achieves significant improvements over a strong single-term baseline, namely the unigram language model.
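To make the idea concrete, here is a minimal sketch of concept-based scoring with semantic smoothing. All data and parameters below are illustrative assumptions, not taken from the paper: the concept counts, the relatedness table `translate`, and the mixing weights `lam` and `beta` are hypothetical. The sketch mixes direct concept evidence in the document with evidence from semantically related concepts, then interpolates with a corpus model, in the spirit of translation-style semantic smoothing.

```python
import math

# Hypothetical toy data: a "concept" may be a single word or a frequent
# collocation such as "language model". Counts and relatedness values
# below are invented for illustration only.
doc_counts = {"language model": 3, "retrieval": 2, "smoothing": 1}
corpus_counts = {"language model": 10, "retrieval": 8, "smoothing": 4,
                 "information retrieval": 6}
# t(c | c'): assumed semantic relatedness between concepts.
translate = {
    ("information retrieval", "retrieval"): 0.5,
}

doc_len = sum(doc_counts.values())
corpus_len = sum(corpus_counts.values())

def p_ml(c, counts, total):
    """Maximum-likelihood estimate of a concept's probability."""
    return counts.get(c, 0) / total

def score(query_concepts, lam=0.5, beta=0.3):
    """Query log-likelihood under a semantically smoothed concept model:
    p(c|d) = (1 - lam) * p_sem(c|d) + lam * p(c|Corpus), where p_sem
    mixes direct evidence with evidence from related concepts."""
    log_p = 0.0
    for c in query_concepts:
        direct = p_ml(c, doc_counts, doc_len)
        related = sum(t * p_ml(c2, doc_counts, doc_len)
                      for (c1, c2), t in translate.items() if c1 == c)
        p_sem = (1 - beta) * direct + beta * related
        p = (1 - lam) * p_sem + lam * p_ml(c, corpus_counts, corpus_len)
        log_p += math.log(p) if p > 0 else float("-inf")
    return log_p

# A query concept absent from the document still gets a finite score,
# because it is semantically related to "retrieval", which the document
# contains: this is how synonymy is handled.
print(score(["information retrieval"]))
```

With `beta = 0`, the model falls back to ordinary interpolated (Jelinek-Mercer style) smoothing over concepts; the semantic term is what lets a document match a query that uses different but related vocabulary.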
Lhadj, L. S., Amrouche, K., & Boughanem, M. (2014). Leveraging concepts and semantic relationships for language model based document retrieval. Lecture Notes in Computer Science, 8748, 100–112. https://doi.org/10.1007/978-3-319-11587-0_11