Portable extraction of partially structured facts from the web

Abstract

A novel fact extraction task is defined to fill a gap between current information retrieval and information extraction technologies. It is shown that it is possible to extract useful, partially structured facts about different kinds of entities in a broad domain, i.e. all kinds of places depicted in tourist images. Importantly, the approach does not rely on existing linguistic resources (gazetteers, taggers, parsers, etc.), and it was ported easily and cheaply between two rather different languages (English and Latvian). Previous fact extraction from the web has focused on the extraction of structured data, e.g. (Building-LocatedIn-Town). In contrast, we extract richer and more interesting facts, such as a fact explaining why a building was built. Enough structure is maintained to facilitate subsequent processing of the information. For example, the partial structure enables straightforward template-based text generation. We report positive results for the correctness and interest of English and Latvian facts and for their utility in enhancing image captions. © 2010 Springer-Verlag Berlin Heidelberg.
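To illustrate how partial structure can support template-based text generation, here is a minimal sketch in Python. It is not the authors' implementation: the `PartialFact` fields, the cue labels, the templates, and the example fact are all hypothetical and invented for illustration. The idea shown is only that a fact with a known entity and a known kind, plus a free-text remainder kept verbatim, can be slotted into a sentence template without further parsing.

```python
from dataclasses import dataclass


@dataclass
class PartialFact:
    """A partially structured fact: two typed slots plus unparsed text.

    All field names here are assumptions made for this sketch.
    """
    entity: str     # the place the fact is about
    cue: str        # hypothetical label for the kind of fact, e.g. "built_to"
    free_text: str  # remainder of the source sentence, kept verbatim


# Hypothetical templates keyed by cue; each splices the slots into a sentence.
TEMPLATES = {
    "built_to": "{entity} was built to {free_text}.",
    "famous_for": "{entity} is famous for {free_text}.",
}


def render_caption(fact: PartialFact) -> str:
    """Fill the template selected by the fact's cue with its slots."""
    template = TEMPLATES[fact.cue]
    return template.format(entity=fact.entity, free_text=fact.free_text.strip())


# Invented example fact, used only to exercise the function.
fact = PartialFact("The lighthouse", "built_to", "guide ships into the harbour ")
caption = render_caption(fact)
```

Because the free-text slot is never parsed, a sketch like this needs no tagger or parser, which is in the spirit of the resource-light approach the abstract describes; supporting another language would then mean supplying new templates rather than new linguistic tools.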

Salway, A., Kelly, L., Skadiņa, I., & Jones, G. J. F. (2010). Portable extraction of partially structured facts from the web. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6233 LNAI, pp. 345–356). https://doi.org/10.1007/978-3-642-14770-8_38
