This chapter presents a novel approach to keyword search in Information Retrieval based on the Tolerance Rough Set Model (TRSM). The bag-of-words representation of each document is extended with additional words, which are stored in the inverted index along with appropriate weights. These extension words are derived using different techniques (e.g. semantic information, word distribution) that are encapsulated in the model by a tolerance relation. Weights for the structural extension are then assigned by an unsupervised algorithm. This method, called TRSM-WL, allows us to improve retrieval effectiveness by returning documents that do not necessarily include words from the query. We compare the performance of these two algorithms on the keyword search problem over a benchmark data set.
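To make the idea of extending a bag-of-words representation through a tolerance relation more concrete, the following is a minimal sketch and not the chapter's implementation. It assumes a co-occurrence-based tolerance relation (two terms are tolerant if they appear together in at least `theta` documents), and it uses a fixed placeholder `extension_weight` where TRSM-WL would learn the weights. The function names, the toy corpus, and both parameters are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class: the term itself plus every term
    co-occurring with it in at least `theta` documents (assumed relation)."""
    cooc = defaultdict(int)
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        for a, b in combinations(sorted(terms), 2):
            cooc[(a, b)] += 1
    classes = {t: {t} for t in vocab}
    for (a, b), count in cooc.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """All terms whose tolerance class shares at least one term with the document."""
    terms = set(doc)
    return {t for t, cls in classes.items() if cls & terms}


def build_index(docs, theta=2, extension_weight=0.5):
    """Inverted index: term -> {doc_id: weight}.
    Original terms get weight 1.0; extension terms get a fixed placeholder
    weight (an assumption here; TRSM-WL would learn these weights)."""
    classes = tolerance_classes(docs, theta)
    index = defaultdict(dict)
    for doc_id, doc in enumerate(docs):
        own = set(doc)
        for t in upper_approximation(doc, classes):
            index[t][doc_id] = 1.0 if t in own else extension_weight
    return index


# Hypothetical toy corpus, already tokenized.
docs = [
    ["rough", "set", "model"],
    ["rough", "set", "tolerance"],
    ["tolerance", "relation"],
    ["set", "theory"],
]
index = build_index(docs, theta=2)
```

In this toy corpus, "rough" and "set" co-occur in two documents, so the fourth document (which contains "set" but not "rough") becomes retrievable for the query "rough" with the lower extension weight. This illustrates how a document can be returned even though it does not contain the query word.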
CITATION STYLE
Świeboda, W., Meina, M., & Nguyen, H. S. (2014). Weight learning in TRSM-based information retrieval. Studies in Computational Intelligence, 541, 61–74. https://doi.org/10.1007/978-3-319-04714-0_5