Clustering visually similar web page elements for structured web data extraction

Abstract

We propose ClustVX, a novel approach to extracting structured web data. It clusters visually similar web page elements by exploiting their visual formatting and structural features. The clusters are then used to derive extraction rules. In an experimental evaluation on three publicly available benchmark data sets, ClustVX outperforms state-of-the-art structured data extraction systems. © 2012 Springer-Verlag.
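The clustering idea can be illustrated with a minimal sketch. It is not the ClustVX implementation: the `Element` type, the `tag_path` and `style` fields, and the exact-match grouping key are all assumptions made for illustration, since the abstract does not specify how the visual and structural features are encoded or compared.

```python
from collections import defaultdict
from typing import NamedTuple


class Element(NamedTuple):
    """Hypothetical page element; the field names are illustrative assumptions."""
    text: str
    tag_path: str  # structural feature, e.g. the chain of tags from the root
    style: str     # visual feature, e.g. font family, size and weight


def cluster_elements(elements):
    """Group elements that share both the structural and the visual feature."""
    clusters = defaultdict(list)
    for el in elements:
        clusters[(el.tag_path, el.style)].append(el.text)
    return dict(clusters)


# Toy input: two product titles and two prices from a hypothetical listing page.
elements = [
    Element("Laptop A", "html/body/div/span", "Arial|14px|bold"),
    Element("$999", "html/body/div/span", "Arial|12px|normal"),
    Element("Laptop B", "html/body/div/span", "Arial|14px|bold"),
    Element("$1299", "html/body/div/span", "Arial|12px|normal"),
]

clusters = cluster_elements(elements)
for key, texts in clusters.items():
    print(key, texts)
```

In this toy input all four elements share the same tag path, so the visual feature is what separates titles from prices. Each resulting cluster corresponds to one candidate data field, from which an extraction rule could then be derived.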

Cite

Grigalis, T., Radvilavičius, L., Čenys, A., & Gordevičius, J. (2012). Clustering visually similar web page elements for structured web data extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7387 LNCS, pp. 435–438). https://doi.org/10.1007/978-3-642-31753-8_38
