Data extraction from the World Wide Web is a well-known, unsolved, and critical problem in the design of complex information systems. The problem concerns the extraction, management, and reuse of the huge amount of Web data available. These data are usually highly heterogeneous, volatile, and of low quality (i.e. format and content mistakes), so it is quite hard to build reliable systems. This chapter proposes an Evolutionary Computation approach, based on Genetic Algorithms and regular expressions, to the problem of automatically learning software entities. These entities, also called wrappers, are learned from examples and are able to extract certain kinds of Web data structures. © 2009 Springer-Verlag US.
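As a rough illustration of the idea of learning a regular-expression wrapper from examples with a Genetic Algorithm, the following minimal Python sketch may help. It is not the chapter's method: the token alphabet `TOKENS`, the `fitness` and `evolve` functions, the truncation selection, the one-point crossover, and all parameter values are assumptions chosen only to keep the example short.

```python
import random
import re

# Hypothetical building blocks for candidate regexes (an assumption,
# not the alphabet used in the chapter). Any concatenation is a valid regex.
TOKENS = [r"\d+", r"\d", "[a-z]+", "-", r"\.", "@", r"\s"]


def fitness(pattern, examples):
    """Fraction of (text, expected) pairs whose first match equals expected."""
    rx = re.compile(pattern)
    correct = 0
    for text, expected in examples:
        match = rx.search(text)
        if match and match.group(0) == expected:
            correct += 1
    return correct / len(examples)


def evolve(examples, generations=50, pop_size=30, max_len=5, seed=0):
    """Evolve a token sequence; returns (best_pattern, best_fitness)."""
    rng = random.Random(seed)

    def random_genome():
        return [rng.choice(TOKENS) for _ in range(rng.randint(1, max_len))]

    def score(genome):
        return fitness("".join(genome), examples)

    population = [random_genome() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        if score(population[0]) == 1.0:
            break  # a wrapper that handles every example was found
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]             # elitism: keep the best as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # One-point crossover on variable-length genomes.
            child = a[: rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
            child = child[:max_len] or [rng.choice(TOKENS)]
            if rng.random() < 0.3:             # point mutation
                child[rng.randrange(len(child))] = rng.choice(TOKENS)
            children.append(child)
        population = children
    best = max(population, key=score)
    return "".join(best), score(best)


if __name__ == "__main__":
    # Made-up training examples: (document fragment, string to extract).
    examples = [("Call 555-1234 now", "555-1234"), ("Tel: 91-8856", "91-8856")]
    print(evolve(examples))
```

Each individual here is a sequence of regex fragments, and its fitness is simply the share of training examples from which the concatenated pattern extracts the expected string. A realistic wrapper learner would likely need a richer alphabet and a fitness function that tolerates noisy, low-quality data.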
CITATION STYLE
Barrero, D. F., Camacho, D., & R-Moreno, M. D. (2009). Automatic web data extraction based on genetic algorithms and regular expressions. In Data Mining and Multi-Agent Integration (pp. 143–154). Springer US. https://doi.org/10.1007/978-1-4419-0522-2_9