Automatic web data extraction based on genetic algorithms and regular expressions

David F. Barrero; David Camacho; María D. R-Moreno

Book Chapter

Springer US, (2009), 143-154

DOI: 10.1007/978-1-4419-0522-2_9

16 Citations

21 Readers

Abstract

Data extraction from the World Wide Web is a well-known, unsolved, and critical problem in the design of complex information systems. The problem involves the extraction, management, and reuse of the huge amount of Web data available. These data are usually highly heterogeneous, volatile, and of low quality (i.e. format and content mistakes), so it is quite hard to build reliable systems. This chapter proposes an Evolutionary Computation approach, based on Genetic Algorithms and regular expressions, to the problem of automatically learning software entities. These entities, also called wrappers, are able to extract some kinds of Web data structures from examples. © 2009 Springer-Verlag US.
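As a rough illustration of the idea in the abstract, the sketch below evolves a regular expression from labeled example strings with a very small genetic algorithm. It is not the chapter's method: the token alphabet `TOKENS`, the fixed genome length, the fitness function, and all parameter values are assumptions chosen only to keep the example short.

```python
import random
import re

# Hypothetical building blocks for candidate regexes (an assumption of this sketch).
TOKENS = [r"\d", r"[a-z]", r"\.", "@", "-", "+", "*"]


def fitness(genome, positives, negatives):
    """Positives fully matched minus negatives fully matched.

    Genomes that do not compile to a valid regex get the worst possible score.
    """
    try:
        pattern = re.compile("".join(genome))
    except re.error:
        return -len(negatives) - 1
    hits = sum(1 for s in positives if pattern.fullmatch(s))
    false_hits = sum(1 for s in negatives if pattern.fullmatch(s))
    return hits - false_hits


def evolve(positives, negatives, generations=50, pop_size=40, length=6, seed=0):
    """Evolve a fixed-length token list; returns the best genome found."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [[rng.choice(TOKENS) for _ in range(length)]
                  for _ in range(pop_size)]

    def score(genome):
        return fitness(genome, positives, negatives)

    for _ in range(generations):
        # Elitism: the best quarter survives unchanged.
        population.sort(key=score, reverse=True)
        elite = population[: pop_size // 4]
        children = []
        while len(elite) + len(children) < pop_size:
            # One-point crossover between two distinct elite parents.
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, length)
            child = a[:cut] + b[cut:]
            # Point mutation with an assumed probability of 0.3.
            if rng.random() < 0.3:
                child[rng.randrange(length)] = rng.choice(TOKENS)
            children.append(child)
        population = elite + children
    return max(population, key=score)


if __name__ == "__main__":
    positives = ["2009", "143", "9"]
    negatives = ["abc", "a1", ""]
    best = evolve(positives, negatives)
    print("".join(best), fitness(best, positives, negatives))
```

The positive and negative strings play the role of the examples a wrapper is learned from. A practical system would need a richer regex representation and a fitness measure suited to real Web pages.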

Cite

Barrero, D. F., Camacho, D., & R-Moreno, M. D. (2009). Automatic web data extraction based on genetic algorithms and regular expressions. In Data Mining and Multi-Agent Integration (pp. 143–154). Springer US. https://doi.org/10.1007/978-1-4419-0522-2_9
