The Lixto project: Exploring new frontiers of web data extraction

Abstract

The Lixto project is an ongoing research effort in the area of Web data extraction. Whereas the project originally started out with the idea to develop a logic-based extraction language and a tool to visually define extraction programs from sample Web pages, the scope of the project has been extended over time. Today, new issues such as employing learning algorithms for the definition of extraction programs, automatically extracting data from Web pages featuring a table-centric visual appearance, and extracting from alternative document formats such as PDF are being investigated. © Springer-Verlag Berlin Heidelberg 2006.

Cite

Carme, J., Ceresna, M., Frölich, O., Gottlob, G., Hassan, T., Herzog, M., … Krüpl, B. (2006). The Lixto project: Exploring new frontiers of web data extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4042 LNCS, pp. 1–15). Springer Verlag. https://doi.org/10.1007/11788911_1
