DOM semantic expansion-based extraction of topical information from web pages

Junjie Chen; Junyao Jia; Liguo Duan

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6988 LNCS(PART 2) 343-350

DOI: 10.1007/978-3-642-23982-3_42

2 Citations

1 Reader

Abstract

Web pages usually contain a great deal of irrelevant information that users do not need, so effective extraction methods are required to separate the relevant content from this clutter. Exploiting the semi-structured nature of HTML, the approach represents a page as a DOM tree and extracts theme-relevant information by semantic pruning, combining structural features of the page with a fuzzy classification of keywords. © 2011 Springer-Verlag.
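The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea (a DOM tree, pruned by keyword relevance), not the authors' method. The names `SKIP_TAGS`, `BLOCK_TAGS`, `membership`, and `extract`, the threshold value, and the use of a simple keyword-hit ratio as a stand-in for fuzzy classification are all assumptions made for illustration.

```python
import re
from html.parser import HTMLParser

# Assumed noise containers and block-level tags; both sets are illustrative.
SKIP_TAGS = {"script", "style", "nav", "footer", "aside"}
BLOCK_TAGS = {"div", "p", "section", "article", "table", "ul", "ol", "li", "td"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class Node:
    """A tiny DOM node: children are either Node objects or text strings."""

    def __init__(self, tag, parent=None):
        self.tag, self.parent, self.children = tag, parent, []

    def full_text(self):
        return " ".join(
            c if isinstance(c, str) else c.full_text() for c in self.children
        )


class TreeBuilder(HTMLParser):
    """Builds a simplified DOM tree from HTML using only the standard library."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.cur)
        self.cur.children.append(node)
        if tag not in VOID_TAGS:
            self.cur = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, if there is one.
        n = self.cur
        while n is not None and n.tag != tag:
            n = n.parent
        if n is not None and n.parent is not None:
            self.cur = n.parent

    def handle_data(self, data):
        if data.strip():
            self.cur.children.append(data.strip())


def membership(text, keywords):
    """Fraction of topic keywords found in the text, in [0, 1].

    A crude stand-in for a fuzzy membership degree (an assumption).
    """
    if not keywords:
        return 0.0
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(1 for k in keywords if k.lower() in words) / len(keywords)


def collect(node, keywords, threshold):
    """Walk the tree, pruning noise tags and low-relevance block subtrees."""
    kept = []
    for child in node.children:
        if isinstance(child, str):
            kept.append(child)
        elif child.tag in SKIP_TAGS:
            continue  # structural pruning
        elif (child.tag in BLOCK_TAGS
              and membership(child.full_text(), keywords) < threshold):
            continue  # relevance-based pruning
        else:
            kept.extend(collect(child, keywords, threshold))
    return kept


def extract(html, keywords, threshold=0.3):
    """Return the text fragments that survive pruning."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return collect(builder.root, keywords, threshold)
```

In this sketch, a call such as `extract(page_html, ["dom", "extract", "topical"])` would drop a navigation `div` that mentions none of the keywords while keeping a paragraph that mentions them. Only block-level elements are scored, so an inline element inside a relevant paragraph is never pruned on its own.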

Cite

Chen, J., Jia, J., & Duan, L. (2011). DOM semantic expansion-based extraction of topical information from web pages. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6988 LNCS, pp. 343–350). https://doi.org/10.1007/978-3-642-23982-3_42
