Information extraction via path merging

Robert Dale; Cecile Paris; Marc Tilbrook

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2003) 2903 150-160

DOI: 10.1007/978-3-540-24581-0_13

Abstract

In this paper, we describe a new approach to information extraction that neatly integrates top-down hypothesis-driven information with bottom-up data-driven information. The aim of the kelp project is to combine a variety of natural language processing techniques so that we can extract useful elements of information from a collection of documents and then re-present this information in a manner that is tailored to the needs of a specific user. Our focus here is on how we can build richly structured data objects by extracting information from web pages; as an example, we describe our methods in the context of extracting information from web pages that describe laptop computers. Our approach, which we call path-merging, involves using relatively simple techniques for identifying what are normally referred to as named entities, then allowing more sophisticated and intelligent techniques to combine these elements of information: effectively, we view the text as providing a collection of jigsaw-piece-like elements of information which then have to be combined to produce a representation of the useful content of the document. A principal goal of this work is the separation of different components of the information extraction task so as to increase portability.
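The abstract does not spell out how path-merging is implemented, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes that each extracted element arrives as a partial attribute path with a value (the bottom-up pieces) and that a small schema of full paths (the top-down hypothesis) says where pieces may fit. The schema contents, the function names `resolve` and `merge_paths`, and the suffix-matching rule are all hypothetical.

```python
# Hypothetical top-down schema: the full attribute paths a laptop record may contain.
SCHEMA = [
    ("laptop", "memory", "size"),
    ("laptop", "memory", "maximum"),
    ("laptop", "screen", "size"),
    ("laptop", "storage", "capacity"),
]


def resolve(partial, schema=SCHEMA):
    """Return every schema path that ends with the given partial path."""
    partial = tuple(partial)
    n = len(partial)
    return [path for path in schema if path[-n:] == partial]


def merge_paths(fragments, schema=SCHEMA):
    """Merge (partial_path, value) fragments into one nested record.

    A fragment is merged only when exactly one schema path matches it;
    ambiguous or unknown fragments are returned separately (an assumption
    of this sketch, chosen so that nothing is guessed).
    """
    record = {}
    unresolved = []
    for partial, value in fragments:
        candidates = resolve(partial, schema)
        if len(candidates) != 1:
            unresolved.append((partial, value))
            continue
        *parents, leaf = candidates[0]
        node = record
        for key in parents:
            # Walk down the record, creating intermediate levels as needed.
            node = node.setdefault(key, {})
        node[leaf] = value
    return record, unresolved


if __name__ == "__main__":
    fragments = [
        (("memory", "size"), "256 MB"),  # unique match in the schema
        (("capacity",), "40 GB"),        # unique match in the schema
        (("size",), "15 in"),            # ambiguous: memory size or screen size
    ]
    record, unresolved = merge_paths(fragments)
    print(record)
    print(unresolved)
```

In this sketch the fragment `("size",)` is left unresolved because it could complete either the memory path or the screen path. Keeping the schema separate from the fragment extraction loosely mirrors the separation of components that the abstract names as a goal.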

Citation (APA)

Dale, R., Paris, C., & Tilbrook, M. (2003). Information extraction via path merging. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2903, pp. 150–160). Springer Verlag. https://doi.org/10.1007/978-3-540-24581-0_13
