Fuzzy combinations of criteria: An application to web page representation for clustering


Abstract

Document representation is an essential step in web page clustering. Web pages are usually written in HTML, whose markup offers useful information for selecting the most important features to represent them. In this paper we investigate the use of nonlinear combinations of criteria, by means of a fuzzy system, to find those features. We start from a term weighting function called Fuzzy Combination of Criteria (fcc), which relies on term frequency, document title, emphasis and term position in the text. Next, we analyze its drawbacks and explore the possibility of adding contextual information extracted from the anchor texts of inlinks, proposing an alternative way of combining criteria based on our experimental results. Finally, we apply a statistical significance test to compare the original representation with our proposal. © 2012 Springer-Verlag.
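
To make the idea of a nonlinear, fuzzy combination of criteria concrete, here is a minimal Python sketch of a rule-based term weighting function over the four criteria named in the abstract. It is not the authors' fcc: the linear membership functions, the four rules, their output levels, and the names `TermCriteria` and `importance` are all assumptions invented for illustration, and every input is assumed to be already normalized to [0, 1].

```python
from dataclasses import dataclass


@dataclass
class TermCriteria:
    """Per-term inputs, each assumed to be normalized to [0, 1]."""
    frequency: float  # term frequency in the page body
    title: float      # term frequency in the page title
    emphasis: float   # term frequency in emphasized text
    position: float   # 1.0 = preferential position, 0.0 = not preferential


def low(x: float) -> float:
    """Membership in the fuzzy set 'low' (linear, illustrative only)."""
    return 1.0 - x


def high(x: float) -> float:
    """Membership in the fuzzy set 'high' (linear, illustrative only)."""
    return x


def importance(c: TermCriteria) -> float:
    """Combine the criteria with hypothetical fuzzy rules.

    Each rule is a pair (firing strength, output level). Fuzzy AND is
    modeled with min, and the rule outputs are combined by a weighted
    average, which makes the overall combination nonlinear.
    """
    rules = [
        # IF title is high THEN importance is high
        (high(c.title), 1.0),
        # IF frequency is high AND emphasis is high THEN importance is high
        (min(high(c.frequency), high(c.emphasis)), 1.0),
        # IF frequency is high AND emphasis is low AND position is high
        # THEN importance is medium
        (min(high(c.frequency), low(c.emphasis), high(c.position)), 0.6),
        # IF frequency, title and emphasis are all low THEN importance is low
        (min(low(c.frequency), low(c.title), low(c.emphasis)), 0.0),
    ]
    total = sum(strength for strength, _ in rules)
    if total == 0.0:
        return 0.0  # no rule fired
    return sum(strength * out for strength, out in rules) / total
```

In this sketch a term that appears in the title receives a high weight even if it is rare in the body, which a plain linear sum of the four criteria would not produce without hand-tuned coefficients. Contextual criteria such as anchor text from inlinks could be added as a further field and further rules, again as an assumption rather than the method the paper proposes.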

Cite

Pérez García-Plaza, A., Fresno, V., & Martínez, R. (2012). Fuzzy combinations of criteria: An application to web page representation for clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7182 LNCS, pp. 157–168). https://doi.org/10.1007/978-3-642-28601-8_14