Automatic Web Page Categorization by Link and Context Analysis

  • Attardi G
  • Gullì A
  • Sebastiani F

Abstract

Assistance in retrieving documents on the World Wide Web is provided either by search engines, through keyword-based queries, or by catalogues, which organize documents into hierarchical collections. Maintaining catalogues manually is becoming increasingly difficult, due to the sheer amount of material on the Web; it is thus becoming necessary to resort to techniques for the automatic classification of documents. Automatic classification is traditionally performed by extracting the information for representing a document ("indexing") from the document itself. The paper describes the novel technique of categorization by context, which instead extracts useful information for classifying a document from the context where a URL referring to it appears. We present the results of experimenting with Theseus, a classifier that exploits this technique.
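
To make the idea of categorization by context concrete, here is a minimal sketch in Python. It is not the Theseus implementation, which the abstract does not detail. It assumes that the "context" of a URL is just the title of the citing page plus the anchor text of the link, and that categories are plain keyword sets. The names `LinkContextParser`, `context_terms`, and `classify` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class LinkContextParser(HTMLParser):
    """Collects the page title and, for each <a href>, its anchor text."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []        # list of (href, anchor_text) pairs
        self._in_title = False
        self._href = None      # href of the <a> element currently open, if any
        self._anchor = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor = []

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.links.append((self._href, " ".join(self._anchor)))
            self._href = None

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._anchor.append(data.strip())


def context_terms(citing_html, target_url):
    """Count the words found in the context of links that point to target_url.

    Assumption: context = citing page title + anchor text. The target
    document itself is never fetched or read.
    """
    parser = LinkContextParser()
    parser.feed(citing_html)
    terms = Counter()
    for href, anchor in parser.links:
        if href == target_url:
            text = f"{parser.title} {anchor}".lower()
            terms.update(re.findall(r"[a-z]+", text))
    return terms


def classify(terms, categories):
    """Return the category whose keywords overlap most with the context terms."""
    scores = {name: sum(terms[word] for word in words)
              for name, words in categories.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

Calling `context_terms` on a page that links to the target and passing the result to `classify` assigns a category using only words found on the citing page, which is the contrast with traditional indexing that the abstract draws.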

Citation (APA)

Attardi, G., Gullì, A., & Sebastiani, F. (1999). Automatic Web Page Categorization by Link and Context Analysis. Proceedings of THAI 99, 105–119. Retrieved from http://wortschatz.uni-leipzig.de/~sbordag/semantische/papers/07/attardi99automatic.pdf
