Using metadata to enhance web information gathering

Abstract

With the web at close to a billion pages and growing at an exponential rate, we are faced with the issue of rating pages in terms of quality and trust. In this situation, what other pages say about a web page can be as important as what the page says about itself. The cumulative knowledge of these types of recommendations (or the lack thereof) can be objective enough to help a user or robot program to decide whether or not to pursue a web document. In addition, these annotations or metadata can be used by a web robot program to derive summary information about web documents that are written in a language that the robot does not understand. We use this idea to drive a web information gathering system that forms the core of a topic-specific search engine. In this paper, we describe how our system uses metadata about the hyperlinks to guide itself to crawl the web. It sifts through useful information related to a particular topic to eliminate the traversal of links that may not be of interest. Thus, the guided crawling system stays focused on the target topic. It builds a rich repository of link information that includes metadata. This repository ultimately serves a search engine.
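The guided crawling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the abstract does not say how links are scored, so the sketch assumes the only link metadata is anchor text and that a link is worth following when enough topic terms appear in it. The names `TOY_WEB`, `link_score`, `guided_crawl`, and the `threshold` parameter are all hypothetical, and the "web" is a small in-memory graph rather than real pages.

```python
import heapq

# Hypothetical in-memory "web": each page maps to its outgoing links, and each
# link carries metadata written by the citing page (here, only anchor text).
TOY_WEB = {
    "seed": [("a", "XML tutorial and schema guide"), ("b", "holiday photos")],
    "a": [("c", "XML parser reference"), ("d", "cooking recipes")],
    "b": [("e", "more photos")],
    "c": [],
    "d": [],
    "e": [],
}


def link_score(anchor_text, topic_terms):
    """Assumed scoring rule: fraction of topic terms found in the anchor text."""
    words = set(anchor_text.lower().split())
    return len(words & topic_terms) / len(topic_terms)


def guided_crawl(web, seed, topic_terms, threshold=0.3):
    """Crawl from `seed`, following only links whose metadata looks on-topic.

    Returns the pages visited (in visit order) and a repository of
    (source, target, anchor_text, score) records for every link examined.
    """
    topic_terms = {t.lower() for t in topic_terms}
    frontier = [(-1.0, seed)]  # max-heap via negated scores
    seen = {seed}
    visited = []
    repository = []

    while frontier:
        _, url = heapq.heappop(frontier)
        visited.append(url)
        for target, anchor in web.get(url, []):
            score = link_score(anchor, topic_terms)
            # Record link metadata even when the link is not followed.
            repository.append((url, target, anchor, score))
            if score >= threshold and target not in seen:
                seen.add(target)
                heapq.heappush(frontier, (-score, target))

    return visited, repository


if __name__ == "__main__":
    pages, links = guided_crawl(TOY_WEB, "seed", {"xml", "parser", "schema"})
    print(pages)
    for record in links:
        print(record)
```

In this sketch the off-topic pages are never fetched, yet the links pointing to them are still recorded in the repository, which mirrors the abstract's point that link metadata can be collected and used without reading the target document.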

Cite

Yi, J., Sundaresan, N., & Huang, A. (2001). Using metadata to enhance web information gathering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1997, pp. 38–57). Springer Verlag. https://doi.org/10.1007/3-540-45271-0_3
