Analyzing distillation process of hidden terms in web documents for IR



Previous work on web-based applications has addressed web content mining, pattern recognition, and similarity measures between web documents. This paper analyzes web documents in an enhanced way and examines the distillation of web documents, which is expected to be the next step in hypertext mining. A sparse document carries very little data, and so may suffer from problems such as synonymy (different words with identical or similar meanings) and sparseness. These problems are major obstacles for natural language processing (NLP) and information retrieval (IR). Mining hidden terms discovers search queries from large external datasets (universal datasets), which helps to handle unseen data more effectively. The goals of this web document mining are to find information efficiently, to filter information based on the user query, and to discover more topic-focused keywords from the rich source of global information datasets. The proposed method uses the Distillation model, which integrates a probabilistic generative model, the Gibbs sampling algorithm, and a deployment method. The model can be applied to different natural languages and data domains to achieve this goal. © 2012 Published by Elsevier Ltd.
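
The abstract does not specify the generative model, so the following is only a minimal sketch, assuming an LDA-style topic model with symmetric Dirichlet priors, of how collapsed Gibbs sampling can infer hidden topics from tokenized documents. The function name `gibbs_lda`, its parameters, and the integer word-id encoding are illustrative assumptions and not the authors' implementation.

```python
import random


def gibbs_lda(docs, num_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for an LDA-style model (illustrative assumption).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic, topic_word) count matrices.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                 # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                               # tokens per topic
    assign = []                                                  # topic of every token

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]

                # Sample a new topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    return doc_topic, topic_word
```

In this sketch, the rows of `topic_word` with the largest counts give the most topic-focused terms for each hidden topic, and `doc_topic` could be used to enrich a sparse document with the topics inferred from a larger external dataset.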




Pradeepa, M., & Deisy, C. (2012). Analyzing distillation process of hidden terms in web documents for IR. In Procedia Engineering (Vol. 38, pp. 3215–3221). Elsevier Ltd.
