Automatic genre detection of web documents

Abstract

Genre, or style, offers a view of a document that is distinct from its subject or topic, and it can likewise serve as a criterion for classifying documents. Several studies have addressed genre detection for textual documents, but only a few have dealt with web documents. In this paper we propose sets of features for detecting the genres of web documents. Web documents differ from plain textual documents in that they carry a URL and contain HTML tags within their pages. We introduce features specific to web documents, extracted from the URL and the HTML tags. Experimental results allow us to evaluate the characteristics and performance of these features. © Springer-Verlag Berlin Heidelberg 2005.
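
To make the idea of URL-based and HTML-tag-based features concrete, here is a minimal Python sketch. It is not the authors' feature set, which the abstract does not enumerate. The feature names (url_depth, url_extension, url_has_query, tag_total, link_ratio, form_count) and the helper functions are hypothetical choices made only for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class TagCounter(HTMLParser):
    """Counts how often each HTML start tag occurs in a page."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names before calling this hook.
        self.tags[tag] += 1


def url_features(url):
    """Illustrative (assumed) URL features: path depth, extension, query flag."""
    parsed = urlparse(url)
    segments = [s for s in parsed.path.split("/") if s]
    last = segments[-1] if segments else ""
    extension = last.rsplit(".", 1)[1].lower() if "." in last else ""
    return {
        "url_depth": len(segments),
        "url_extension": extension,
        "url_has_query": bool(parsed.query),
    }


def html_features(html):
    """Illustrative (assumed) tag features: tag count, link share, form count."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    total = sum(parser.tags.values())
    return {
        "tag_total": total,
        "link_ratio": parser.tags["a"] / total if total else 0.0,
        "form_count": parser.tags["form"],
    }


def web_features(url, html):
    """Merge both feature groups into one dictionary for a classifier."""
    return {**url_features(url), **html_features(html)}
```

A dictionary of this kind could be passed to any standard classifier. Which features actually discriminate between genres is an empirical question of the sort the paper's experiments address.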

Cite

Lim, C. S., Lee, K. J., & Kim, G. C. (2005). Automatic genre detection of web documents. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3248, pp. 310–319). Springer Verlag. https://doi.org/10.1007/978-3-540-30211-7_33
