DOM semantic expansion-based extraction of topical information from web pages

Abstract

Web pages usually contain a large amount of irrelevant information that users do not need, so effective methods are required to extract the relevant information from this cluttered content. Exploiting the semi-structured nature of HTML, theme-relevant information in a web page can be extracted by semantic pruning of its DOM representation, combined with features of the page structure and a fuzzy classification of keywords. © 2011 Springer-Verlag.
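The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea (build a DOM-like tree, score each block by keyword relevance, prune low-scoring blocks), not the authors' method. The keyword weights in `TOPIC_KEYWORDS`, the `BLOCK_TAGS` set, the scoring formula, and the `threshold` value are all illustrative assumptions.

```python
from html.parser import HTMLParser

# Hypothetical fuzzy membership of each keyword in the topic, in [0, 1].
TOPIC_KEYWORDS = {"dom": 1.0, "extraction": 0.8, "semantic": 0.6}
# Assumed set of tags treated as content blocks, and tags with no end tag.
BLOCK_TAGS = {"div", "p", "td", "li", "section", "article"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []  # child Node objects
        self.text = []      # text chunks directly inside this node

    def full_text(self):
        """Text of this node followed by the text of its descendants."""
        parts = list(self.text)
        parts.extend(child.full_text() for child in self.children)
        return " ".join(p for p in parts if p)


class TreeBuilder(HTMLParser):
    """Builds a simplified DOM-like tree from HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        node = Node(tag, self.current)
        self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Walk up to the nearest matching open tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        if data.strip():
            self.current.text.append(data.strip())


def relevance(text):
    """Average keyword membership over all words in the text."""
    words = [w.strip(".,;:!?()").lower() for w in text.split()]
    if not words:
        return 0.0
    return sum(TOPIC_KEYWORDS.get(w, 0.0) for w in words) / len(words)


def has_block_descendant(node):
    return any(c.tag in BLOCK_TAGS or has_block_descendant(c)
               for c in node.children)


def extract_topical(html, threshold=0.1):
    """Return the text of innermost blocks whose relevance meets the threshold."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    kept = []

    def visit(node):
        if node.tag in BLOCK_TAGS and not has_block_descendant(node):
            text = node.full_text()
            if relevance(text) >= threshold:
                kept.append(text)  # relevant block: keep it
            return                 # otherwise the block is pruned
        for child in node.children:
            visit(child)

    visit(builder.root)
    return kept


page = ('<html><body><div class="nav"><p>Home About Login</p></div>'
        '<div><p>DOM extraction of semantic content</p></div></body></html>')
print(extract_topical(page))
```

In this sketch the navigation block scores zero and is pruned, while the paragraph containing topic keywords is kept. A fuller treatment would also use structural features such as link density or tag depth, which the abstract mentions but does not specify.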

Cite

Chen, J., Jia, J., & Duan, L. (2011). DOM semantic expansion-based extraction of topical information from web pages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6988 LNCS, pp. 343–350). https://doi.org/10.1007/978-3-642-23982-3_42
