Traditional search engines ignore the tremendous amount of information “hidden” behind the search forms of Web pages in large searchable electronic databases, known as the hidden Web. In this paper, we address the problem of designing a system for extracting and retrieving hidden Web information. We present a generic operational model of hidden Web information retrieval and describe its key techniques. We introduce a new Tag-Tree-based Object Extraction Technique for automatically extracting hidden Web information from Web pages. Based on this technique, we implement a retrieval algorithm for structured queries over hidden Web information. Test results are also reported. © Springer-Verlag Berlin Heidelberg 2002.
CITATION STYLE
Hui, S., Ling, Z., Yunming, Y., & Fanyuan, M. (2002). Object-extraction-based hidden web information retrieval. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2419, 338–345. https://doi.org/10.1007/3-540-45703-8_31