Automatic web data extraction based on genetic algorithms and regular expressions


Abstract

Data extraction from the World Wide Web is a well-known, unsolved, and critical problem in the design of complex information systems. It concerns the extraction, management, and reuse of the huge amount of Web data available. These data are usually highly heterogeneous, volatile, and of low quality (i.e., they contain format and content mistakes), so it is quite hard to build reliable systems. This chapter proposes an Evolutionary Computation approach, based on Genetic Algorithms and regular expressions, to the problem of automatically learning software entities. These entities, also called wrappers, will be able to extract certain kinds of Web data structures from examples. © 2009 Springer-Verlag US.
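
To make the idea in the abstract concrete, the following is a minimal sketch of how a genetic algorithm might evolve a regular expression from labeled examples. It is not the chapter's actual method: the token alphabet `TOKENS`, the accuracy-based `fitness`, the fixed genome length, and the `mutate`, `crossover`, and `evolve` helpers are all assumptions chosen for illustration.

```python
import random
import re

# Hypothetical building blocks (an assumption, not the chapter's alphabet).
# The empty string acts as a no-op gene so shorter patterns can emerge.
TOKENS = [r"\d", r"\w", r"[a-z]", "@", r"\.", "-", "+", "*", ""]


def compile_or_none(genes):
    """Join the genes into one pattern; return None if it is not a valid regex."""
    try:
        return re.compile("".join(genes))
    except re.error:
        return None


def fitness(genes, positives, negatives):
    """Fraction of examples classified correctly using a full-string match."""
    regex = compile_or_none(genes)
    if regex is None:
        return 0.0  # invalid regexes get the worst possible score
    hits = sum(1 for s in positives if regex.fullmatch(s))
    hits += sum(1 for s in negatives if not regex.fullmatch(s))
    return hits / (len(positives) + len(negatives))


def mutate(genes, rng, rate=0.2):
    """Replace each gene with a random token with probability `rate`."""
    return [rng.choice(TOKENS) if rng.random() < rate else g for g in genes]


def crossover(a, b, rng):
    """One-point crossover between two equal-length genomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(positives, negatives, length=4, pop_size=40, generations=60, seed=0):
    """Evolve a regex; returns (pattern, score). Illustrative parameters only."""
    rng = random.Random(seed)

    def score(genes):
        return fitness(genes, positives, negatives)

    population = [
        [rng.choice(TOKENS) for _ in range(length)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        elite = population[: pop_size // 4]  # elitism: best genomes survive
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = max(population, key=score)
    return "".join(best), score(best)
```

Calling `evolve(["2009", "1998"], ["20a9", "abcd", ""])` returns a candidate pattern together with its score on the examples. The result depends on the seed and parameters, and a perfect score is not guaranteed.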

Cite

Barrero, D. F., Camacho, D., & R-Moreno, M. D. (2009). Automatic web data extraction based on genetic algorithms and regular expressions. In Data Mining and Multi-Agent Integration (pp. 143–154). Springer US. https://doi.org/10.1007/978-1-4419-0522-2_9
