Automated web data mining using semantic analysis

Wenxiang Dou; Jinglu Hu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7713 LNAI, pp. 539-551 (2012)

DOI: 10.1007/978-3-642-35527-1_45

Abstract

This paper presents an automated approach to extracting product data from commercial web pages. Our web mining method involves two phases. First, it analyzes the data located at the leaf nodes of the web page's DOM tree, generates semantic information vectors for the other nodes of the DOM tree, and finds the maximum repeated semantic vector pattern. Second, it identifies the product data region and data records, builds a product object template using a semantic tree matching technique, and uses the template to extract all product data from the web page. The main contribution of this study is a fully automated approach to extracting product data from commercial sites without any user assistance. Experimental results show that the proposed technique is highly effective. © Springer-Verlag 2012.
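
The two phases described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the leaf labels ("image", "price", "text"), the price regular expression, the helper names, and the sample HTML are all assumptions made for illustration, and exact equality of semantic vectors stands in for the paper's semantic tree matching. It assumes well-formed HTML and uses only the Python standard library.

```python
import re
from collections import Counter
from html.parser import HTMLParser

PRICE = re.compile(r"[$€£¥]\s*\d")      # assumed heuristic for price-like text
VOID = {"img", "br", "hr", "input", "meta", "link"}

class Node:
    def __init__(self, tag, parent=None):
        self.tag, self.parent = tag, parent
        self.children = []
        self.leaves = []                 # (semantic label, value) pairs held directly

class TreeBuilder(HTMLParser):
    """Builds a simplified DOM tree and labels leaf data as it goes."""
    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.cur = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.cur)
        self.cur.children.append(node)
        if tag == "img":
            node.leaves.append(("image", dict(attrs).get("src", "")))
        if tag not in VOID:              # void elements have no end tag
            self.cur = node

    def handle_endtag(self, tag):
        if tag not in VOID and self.cur.parent is not None:
            self.cur = self.cur.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            label = "price" if PRICE.search(text) else "text"
            self.cur.leaves.append((label, text))

def parse(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root

def leaves(node):
    """All labeled leaf data in the subtree rooted at node, in document order."""
    out = list(node.leaves)
    for child in node.children:
        out.extend(leaves(child))
    return out

def semantic_vector(node):
    """Phase 1: summarize a subtree as sorted (label, count) pairs."""
    return tuple(sorted(Counter(label for label, _ in leaves(node)).items()))

def most_repeated_pattern(root):
    """Find the parent whose children repeat one multi-label vector most often."""
    best = (0, None, None)               # (repeat count, parent node, pattern)
    stack = [root]
    while stack:
        node = stack.pop()
        stack.extend(node.children)
        vectors = Counter(v for v in map(semantic_vector, node.children) if len(v) > 1)
        if vectors:
            pattern, n = vectors.most_common(1)[0]
            if n > best[0]:
                best = (n, node, pattern)
    return best

def extract_records(region, pattern):
    """Phase 2 (simplified): keep children of the region that match the pattern."""
    return [dict(leaves(c)) for c in region.children if semantic_vector(c) == pattern]

SAMPLE = """
<div id="nav"><a>Home</a><a>About</a></div>
<ul>
  <li><img src="a.png"><span>Red mug</span><b>$5.00</b></li>
  <li><img src="b.png"><span>Blue mug</span><b>$6.50</b></li>
  <li><img src="c.png"><span>Green mug</span><b>$7.25</b></li>
</ul>
"""

count, region, pattern = most_repeated_pattern(parse(SAMPLE))
records = extract_records(region, pattern)
for record in records:
    print(record)
```

On the sample page, the navigation links are ignored because their vectors contain only one label, while the three list items share the same image/price/text vector and are returned as records. A real page would need a richer label set and a tolerant tree comparison rather than exact vector equality.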

Citation (APA)

Dou, W., & Hu, J. (2012). Automated web data mining using semantic analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7713 LNAI, pp. 539–551). https://doi.org/10.1007/978-3-642-35527-1_45
