Towards a method for unsupervised web information extraction

Hassan A. Sleiman; Rafael Corchuelo

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012), Vol. 7387 LNCS, pp. 427–430

DOI: 10.1007/978-3-642-31753-8_36

Abstract

The literature provides a variety of techniques to build the information extractors on which some data integration systems rely. Information extraction techniques are usually based on extraction rules that require maintenance and adaptation if web sources change. We present our preliminary steps towards an unsupervised information extraction technique that searches web documents for shared patterns and fragments them until it finds the relevant information that should be extracted. Experimental results on 1230 real web documents demonstrate that our system is fast and achieves promising results. © 2012 Springer-Verlag.

Citation (APA)

Sleiman, H. A., & Corchuelo, R. (2012). Towards a method for unsupervised web information extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7387 LNCS, pp. 427–430). https://doi.org/10.1007/978-3-642-31753-8_36
