Web-site boundary detection

Abstract

Defining the boundaries of a web-site, for (say) archiving or information retrieval purposes, is an important but complicated task. In this paper a web-page clustering approach to boundary detection is suggested. The principal issue is feature selection, hampered by the observation that there is no clear understanding of what a web-site is. This paper proposes a definition of a web-site, founded on the principle of user intention, directed at the boundary detection problem; and then reports on a sequence of experiments, using a number of clustering techniques, and a wide range of features and combinations of features to identify web-site boundaries. The preliminary results reported seem to indicate that, in general, a combination of features produces the most appropriate result. © 2010 Springer-Verlag Berlin Heidelberg.
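The clustering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say which features or algorithms were used, so everything here is an assumption made for illustration. The feature set (URL host and path segments), the two-cluster k-means with a deterministic start, and the names url_features, two_means and site_boundary are all hypothetical. Pages that fall into the same cluster as a chosen seed page are treated as lying inside the site boundary.

```python
from urllib.parse import urlparse

def url_features(url):
    """Assumed feature set: the host name plus each path segment of the URL."""
    parts = urlparse(url)
    tokens = {"host:" + parts.netloc}
    tokens.update("path:" + seg for seg in parts.path.split("/") if seg)
    return tokens

def vectorize(feature_sets):
    """Turn token sets into binary vectors over a shared, sorted vocabulary."""
    vocab = sorted(set().union(*feature_sets))
    return [[1.0 if tok in fs else 0.0 for tok in vocab] for fs in feature_sets]

def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def two_means(vectors, seed_index=0, max_iter=20):
    """k-means with k=2, started from the seed page and the page farthest from it."""
    far = max(range(len(vectors)),
              key=lambda i: sq_dist(vectors[i], vectors[seed_index]))
    centroids = [vectors[seed_index], vectors[far]]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            0 if sq_dist(v, centroids[0]) <= sq_dist(v, centroids[1]) else 1
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centroids[k] = mean(members)
    return labels

def site_boundary(urls, seed_index=0):
    """Return the URLs that share a cluster with the seed page."""
    labels = two_means(vectorize([url_features(u) for u in urls]), seed_index)
    return [u for u, lab in zip(urls, labels) if lab == labels[seed_index]]

# Made-up example pages; the first URL is the seed page.
urls = [
    "http://dept.example.org/",
    "http://dept.example.org/staff/alice",
    "http://dept.example.org/staff/bob",
    "http://dept.example.org/research/",
    "http://other.example.com/news/today",
    "http://other.example.com/news/archive",
]
print(site_boundary(urls))
```

With URL tokens as the only feature, the result largely follows the host name. Further assumed features, such as link structure or page content, could be appended to the same vectors to try the kind of feature combination the abstract mentions.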

Cite

Alshukri, A., Coenen, F., & Zito, M. (2010). Web-site boundary detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6171 LNAI, pp. 529–543). https://doi.org/10.1007/978-3-642-14400-4_41
