A new webpage classification model based on visual information using Gestalt laws of grouping

Abstract

Traditional text-based webpage classification cannot handle modern webpages, which embed rich information beyond text. Current approaches treat webpages as either trees or images. However, the former focus only on webpage structure, and the latter ignore the internal connections among different webpage features, so neither is well suited to classifying modern webpages. Semantic-block trees are therefore introduced as a new representation for webpages. They are constructed by extracting visual information from webpages, integrating that information into render-blocks, and merging the render-blocks using the Gestalt laws of grouping. A block tree edit distance is then defined to evaluate both the structural and the visual similarity of pages. Using this distance as a metric, a classification framework is proposed that classifies webpages by their similarity.
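
The pipeline described in the abstract (a tree of blocks, an edit distance over such trees, and similarity-based classification) can be illustrated with a minimal sketch. This is not the authors' method. The `Block` class, the string labels standing in for visual features, the unit edit costs, the simplified top-down edit distance, and the 1-nearest-neighbour rule are all assumptions chosen for illustration. The Gestalt-based merging of render-blocks is not modelled at all.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """Hypothetical block node: a label standing in for visual features."""
    label: str
    children: List["Block"] = field(default_factory=list)


def size(block: Block) -> int:
    """Number of nodes in the subtree rooted at `block`."""
    return 1 + sum(size(child) for child in block.children)


def distance(a: Block, b: Block) -> int:
    """Simplified top-down edit distance between two ordered trees.

    Assumed costs: relabelling a node costs 1, and inserting or deleting
    a whole subtree costs its node count. Children are aligned in order
    with a standard sequence-alignment dynamic programme.
    """
    relabel = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + size(b.children[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(
                dp[i - 1][j] + size(a.children[i - 1]),   # delete subtree
                dp[i][j - 1] + size(b.children[j - 1]),   # insert subtree
                dp[i - 1][j - 1]
                + distance(a.children[i - 1], b.children[j - 1]),  # match
            )
    return relabel + dp[m][n]


def classify(page: Block, labeled: List[Tuple[Block, str]]) -> str:
    """Assign the class of the nearest labeled tree (1-nearest-neighbour)."""
    return min(labeled, key=lambda pair: distance(page, pair[0]))[1]


# Toy trees with made-up labels, for illustration only.
news = Block("page", [
    Block("header"),
    Block("text", [Block("text"), Block("image")]),
    Block("footer"),
])
shop = Block("page", [
    Block("header"),
    Block("grid", [Block("image"), Block("image"), Block("image")]),
    Block("footer"),
])
query = Block("page", [
    Block("header"),
    Block("text", [Block("text")]),
    Block("footer"),
])

predicted = classify(query, [(news, "news"), (shop, "shop")])
```

In this toy setup the query tree differs from `news` by one inserted leaf and from `shop` by several relabels and insertions, so `predicted` is `"news"`. A distance closer to the one the abstract describes would also compare visual properties of blocks rather than labels alone.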

Cite

Xu, Z., & Miller, J. (2015). A new webpage classification model based on visual information using gestalt laws of grouping. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9419, pp. 225–232). Springer Verlag. https://doi.org/10.1007/978-3-319-26187-4_18
