Enhanced web crawler design to classify web documents using contextual metadata

Abstract

The World Wide Web (WWW) is the largest information repository in existence. Users routinely turn to the Web to gather information, deepen their knowledge, and build expertise in their fields, and search engines are the primary means of retrieving that information. The central challenge for a search engine lies in the relevancy of the result set it presents to the user. To deliver more relevant results, most search engines employ a Web crawler as a key component for indexing Web pages. Web crawlers (also called Web spiders or robots) are programs that download documents from the Internet. A focused crawler is a specialized crawler that searches for and indexes only Web pages on a particular topic, thereby reducing network traffic and download volume. This paper identifies a set of factors for determining the relevancy of Web documents and introduces a contextual metadata framework that summarizes the captured relevancy data; the framework can be used to categorize and sort results and, in essence, to improve the quality of the result set presented to the end user.
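To make the idea concrete, the sketch below shows what a focused crawler with relevancy scoring and contextual-metadata records might look like. It is a minimal illustration, not the authors' implementation: the keyword-frequency relevancy measure, the metadata fields (URL, title, score, fetch time), and the threshold value are all assumptions introduced here for demonstration.

```python
# Minimal focused-crawler sketch (illustrative only; not the paper's method).
# Assumption: relevancy is approximated by topic-keyword term frequency, and the
# "contextual metadata" record is a simple dict of URL, title, score, fetch time.
import time
import urllib.request
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageParser(HTMLParser):
    """Collect outgoing links, the <title>, and visible text from a page."""
    def __init__(self):
        super().__init__()
        self.links, self.text_parts = [], []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        self.text_parts.append(data)

def relevancy(text, topic_terms):
    """Crude relevancy factor: fraction of words that are topic terms."""
    words = text.lower().split()
    if not words:
        return 0.0
    hits = sum(1 for w in words if w.strip(".,;:!?") in topic_terms)
    return hits / len(words)

def focused_crawl(seed_urls, topic_terms, threshold=0.01, max_pages=50):
    """Breadth-first crawl that only expands links found on relevant pages."""
    frontier, seen, index = deque(seed_urls), set(seed_urls), []
    while frontier and len(index) < max_pages:
        url = frontier.popleft()
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except Exception:
            continue  # skip unreachable or non-HTML pages
        parser = PageParser()
        parser.feed(html)
        score = relevancy(" ".join(parser.text_parts), topic_terms)
        # Contextual-metadata record summarizing the captured relevancy data.
        index.append({"url": url, "title": parser.title.strip(),
                      "score": score, "fetched": time.time()})
        if score >= threshold:  # focused step: follow links only from relevant pages
            for link in parser.links:
                absolute = urljoin(url, link)
                if absolute.startswith("http") and absolute not in seen:
                    seen.add(absolute)
                    frontier.append(absolute)
    # Sort the result set by relevancy score before presenting it.
    return sorted(index, key=lambda rec: rec["score"], reverse=True)

if __name__ == "__main__":
    results = focused_crawl(["https://example.com"], {"crawler", "metadata", "web"})
    for rec in results[:5]:
        print(f"{rec['score']:.4f}  {rec['title']}  {rec['url']}")
```

The defining design choice is the threshold gate: links are enqueued only from pages that score as relevant, which is what keeps a focused crawler's network traffic and download volume below that of a general-purpose crawler.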

Citation

Rajesh, L., Shanthi, V., & Varadhan, V. (2015). Enhanced web crawler design to classify web documents using contextual metadata. In Advances in Intelligent Systems and Computing (Vol. 336, pp. 509–516). Springer Verlag. https://doi.org/10.1007/978-81-322-2220-0_42
