Short queries have been observed to generally perform better than their corresponding long versions when run against the same IR model. This is mainly because most current models do not distinguish the importance of different terms in the query. Observing that sentence-like queries encode information about term importance in their grammatical structure, we propose a Hidden Markov Model (HMM) based method that extracts this information for term weighting. The choice of HMM is motivated by its successful application to capturing relationships between adjacent terms in the NLP field. Since we are dealing with queries in natural language form, we expect that an HMM can also capture the dependence between term weights and grammatical structure. Our experiments show that this assumption is reasonable and that such information, when used properly, can greatly improve retrieval performance. © 2012 Springer-Verlag.
CITATION STYLE
Yan, X., Gao, G., Su, X., Wei, H., Zhang, X., & Lu, Q. (2012). Hidden Markov model for term weighting in verbose queries. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7488 LNCS, pp. 82–87). https://doi.org/10.1007/978-3-642-33247-0_10