Automatic Web Page Categorization by Link and Context Analysis

Giuseppe Attardi; A Gullì; F Sebastiani

Journal Article

Proceedings of THAI 99 (1999) 105-119

Abstract

Assistance in retrieving documents on the World Wide Web is provided either by search engines, through keyword-based queries, or by catalogues, which organize documents into hierarchical collections. Maintaining catalogues manually is becoming increasingly difficult, due to the sheer amount of material on the Web; it is thus becoming necessary to resort to techniques for the automatic classification of documents. Automatic classification is traditionally performed by extracting the information for representing a document ("indexing") from the document itself. The paper describes the novel technique of categorization by context, which instead extracts useful information for classifying a document from the context where a URL referring to it appears. We present the results of experimenting with Theseus, a classifier that exploits this technique.
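As a rough illustration of the idea of categorization by context, the Python sketch below collects, for each link on a page, its anchor text together with the nearest preceding heading or title, and then assigns the linked URL to the category whose keywords overlap most with that context. This is not the Theseus implementation: the class name `LinkContextParser`, the `categorize` function, the keyword sets, and the sample page are all hypothetical, and simple keyword overlap is only a stand-in for whatever scoring the paper actually uses.

```python
from collections import Counter, defaultdict
from html.parser import HTMLParser
import re

HEADING_TAGS = ("title", "h1", "h2", "h3")


class LinkContextParser(HTMLParser):
    """Collects, for every link, its anchor text plus the latest heading seen."""

    def __init__(self):
        super().__init__()
        self.contexts = defaultdict(list)  # URL -> list of context strings
        self._href = None          # href of the <a> currently open, if any
        self._anchor = []          # text pieces inside the open <a>
        self._in_heading = False
        self._heading = []         # text pieces of the heading being read
        self._last_heading = ""    # most recently completed heading or title

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._anchor = []
        elif tag in HEADING_TAGS:
            self._in_heading = True
            self._heading = []

    def handle_data(self, data):
        if self._href is not None:
            self._anchor.append(data)
        elif self._in_heading:
            self._heading.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # The context of a URL is its anchor text plus the enclosing heading.
            text = " ".join(self._anchor + [self._last_heading])
            self.contexts[self._href].append(text)
            self._href = None
        elif tag in HEADING_TAGS:
            self._in_heading = False
            self._last_heading = " ".join(self._heading)


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def categorize(contexts, categories):
    """Assign each URL the category whose keywords best overlap its contexts."""
    result = {}
    for url, texts in contexts.items():
        words = Counter(w for t in texts for w in tokenize(t))
        scores = {name: sum(words[k] for k in kws)
                  for name, kws in categories.items()}
        best = max(scores, key=scores.get)
        result[url] = best if scores[best] > 0 else None
    return result


# Hypothetical sample page and category keywords, for illustration only.
PAGE = """
<html><head><title>Sports links</title></head><body>
<h2>Football clubs</h2>
<a href="http://example.org/a">Official club site</a>
<h2>Cooking recipes</h2>
<a href="http://example.org/b">Pasta and sauces</a>
</body></html>
"""
CATEGORIES = {
    "sport": {"football", "club", "sports"},
    "food": {"cooking", "recipes", "pasta"},
}

parser = LinkContextParser()
parser.feed(PAGE)
labels = categorize(parser.contexts, CATEGORIES)
print(labels)
```

Note that the linked documents themselves are never fetched: each URL is classified purely from the text surrounding the link on the referring page, which is the contrast with traditional indexing that the abstract draws.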

Cite

Attardi, G., Gullì, A., & Sebastiani, F. (1999). Automatic Web Page Categorization by Link and Context Analysis. Proceedings of THAI 99, 105–119. Retrieved from http://wortschatz.uni-leipzig.de/~sbordag/semantische/papers/07/attardi99automatic.pdf
