Mining HTML pages to support document sharing in a cooperative system

Donato Malerba; Floriana Esposito; Michelangelo Ceci

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 2490 LNCS, pp. 420–434 (2002)

DOI: 10.1007/3-540-36128-6_25

5 Citations

4 Readers

Abstract

In this paper, the problem of classifying HTML documents is investigated in the context of a client-server application, named WebClass, developed to support the search activity of a geographically distributed group of people with common interests. The two main issues studied in the paper are the selection of features to represent HTML documents and the construction of the classifiers. A new feature selection technique is presented and its interaction with different classifiers is experimentally studied. Results show that performance improves even with simple classifiers and that the proposed feature selection technique compares favorably with other well-known approaches. © Springer-Verlag Berlin Heidelberg 2002.

Cite

Malerba, D., Esposito, F., & Ceci, M. (2002). Mining HTML pages to support document sharing in a cooperative system. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2490 LNCS, pp. 420–434). Springer Verlag. https://doi.org/10.1007/3-540-36128-6_25
