The Lixto project: Exploring new frontiers of web data extraction

Julien Carme; Michal Ceresna; Oliver Frölich; Georg Gottlob; Tamir Hassan; Marcus Herzog; Wolfgang Holzinger; Bernhard Krüpl

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4042 LNCS 1-15

DOI: 10.1007/11788911_1

Abstract

The Lixto project is an ongoing research effort in the area of Web data extraction. Whereas the project originally started out with the idea of developing a logic-based extraction language and a tool to visually define extraction programs from sample Web pages, the scope of the project has been extended over time. Today, new issues such as employing learning algorithms for the definition of extraction programs, automatically extracting data from Web pages featuring a table-centric visual appearance, and extracting from alternative document formats such as PDF are being investigated. © Springer-Verlag Berlin Heidelberg 2006.

Cite

Carme, J., Ceresna, M., Frölich, O., Gottlob, G., Hassan, T., Herzog, M., … Krüpl, B. (2006). The Lixto project: Exploring new frontiers of web data extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4042 LNCS, pp. 1–15). Springer Verlag. https://doi.org/10.1007/11788911_1
