Enhanced web crawler design to classify web documents using contextual metadata

Abstract

The World Wide Web (WWW) is the largest information repository in existence. Users routinely turn to the Web to gather information, deepen their knowledge, and build expertise in their fields, and search engines are the primary means of retrieving that information. The central challenge for a search engine lies in the relevancy of the result set it presents to the user. To deliver more relevant results, most search engines employ a Web crawler as a key component for indexing Web pages. Web crawlers (also called Web spiders or robots) are programs that download documents from the Internet. A focused crawler is a specialized crawler that searches for and indexes only Web pages on a particular topic, thereby reducing network traffic and download volume. This paper identifies a set of factors for determining the relevancy of Web documents and introduces a contextual metadata framework that summarizes the captured relevancy data; the framework can be used to categorize and sort results and, in essence, to improve the quality of the result set presented to the end user.
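To make the idea concrete, the sketch below shows what a focused crawler with relevancy scoring and contextual-metadata records might look like. It is a minimal illustration, not the authors' implementation: the keyword-frequency relevancy measure, the metadata fields (URL, title, score, fetch time), and the threshold value are all assumptions introduced here for demonstration.

```python
# Minimal focused-crawler sketch (illustrative only; not the paper's method).
# Assumption: relevancy is approximated by topic-keyword term frequency, and the
# "contextual metadata" record is a simple dict of URL, title, score, fetch time.
import time
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageParser(HTMLParser):
    """Collect outgoing links, the <title>, and visible text from a page."""
    def __init__(self):
        super().__init__()
        self.links, self.text_parts = [], []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        self.text_parts.append(data)

def relevancy(text, topic_terms):
    """Crude relevancy factor: fraction of words that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,;:!?") in topic_terms)
    return hits / len(words)

def focused_crawl(seed_urls, topic_terms, threshold=0.01, max_pages=50):
    """Breadth-first crawl that only expands links found on relevant pages."""
    frontier, seen, index = deque(seed_urls), set(seed_urls), []
    while frontier and len(index) < max_pages:
        url = frontier.popleft()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except Exception:
            continue  # skip unreachable or non-HTML pages
        parser = PageParser()
        parser.feed(html)
        score = relevancy(" ".join(parser.text_parts), topic_terms)
        # Contextual-metadata record summarizing the captured relevancy data.
        index.append({"url": url, "title": parser.title.strip(),
                      "score": score, "fetched": time.time()})
        if score >= threshold:  # focused step: follow links only from relevant pages
            for link in parser.links:
                absolute = urljoin(url, link)
                if absolute.startswith("http") and absolute not in seen:
                    seen.add(absolute)
                    frontier.append(absolute)
    # Sort the result set by relevancy score before presenting it.
    return sorted(index, key=lambda rec: rec["score"], reverse=True)

if __name__ == "__main__":
    results = focused_crawl(["https://example.com"], {"crawler", "metadata", "web"})
    for rec in results[:5]:
        print(f"{rec['score']:.4f}  {rec['title']}  {rec['url']}")
```

The defining design choice is the threshold gate: links are enqueued only from pages that score as relevant, which is what keeps a focused crawler's network traffic and download volume below that of a general-purpose crawler.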

Citation

Rajesh, L., Shanthi, V., & Varadhan, V. (2015). Enhanced web crawler design to classify web documents using contextual metadata. In Advances in Intelligent Systems and Computing (Vol. 336, pp. 509–516). Springer Verlag. https://doi.org/10.1007/978-81-322-2220-0_42
