Extracting multilingual natural-language patterns for RDF predicates

Daniel Gerber; Axel Cyrille Ngonga Ngomo

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7603 LNAI 87-96

DOI: 10.1007/978-3-642-33876-2_10

34 Citations

51 Readers

Abstract

Most knowledge sources on the Data Web were extracted from structured or semi-structured data. Thus, they cover only a small fraction of the information available on the document-oriented Web. In this paper, we present BOA, a bootstrapping strategy for extracting RDF from text. The idea behind BOA is to extract natural-language patterns that represent predicates found on the Data Web from unstructured data by using background knowledge from the Data Web. These patterns are then used to extract instance knowledge from natural-language text. This knowledge is finally fed back into the Data Web, thereby closing the loop. The approach followed by BOA is largely independent of the language in which the corpus is written. We demonstrate our approach by applying it to four different corpora and two different languages. We evaluate BOA on these data sets using DBpedia as background knowledge. Our results show that we can extract several thousand new facts in one iteration with very high accuracy. © 2012 Springer-Verlag.
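
The bootstrapping loop described in the abstract can be illustrated with a toy sketch. This is not the BOA implementation: the function names, the `min_support` threshold, the plain-string facts, and the rule that an entity is a run of capitalized words are all assumptions made for illustration. A real system would use labels from a knowledge base such as DBpedia and a proper pattern-scoring step.

```python
import re
from collections import Counter, defaultdict


def extract_patterns(sentences, known_facts):
    """Collect the text found between the labels of known (s, p, o) facts.

    Hypothetical helper: returns {predicate: Counter({infix: count})}.
    """
    patterns = defaultdict(Counter)
    for subj, pred, obj in known_facts:
        # Match "<subject> <infix> <object>" with the labels in that order.
        regex = re.compile(
            re.escape(subj) + r"\s+(.+?)\s+" + re.escape(obj)
        )
        for sentence in sentences:
            match = regex.search(sentence)
            if match:
                patterns[pred][match.group(1)] += 1
    return patterns


def apply_patterns(sentences, patterns, min_support=2):
    """Use sufficiently frequent infixes to propose (s, p, o) facts.

    Assumption: an entity is a run of capitalized words. This is a crude
    stand-in for real entity recognition.
    """
    entity = r"((?:[A-Z][\w-]*)(?:\s[A-Z][\w-]*)*)"
    facts = set()
    for pred, counts in patterns.items():
        for infix, support in counts.items():
            if support < min_support:
                continue  # skip patterns seen too rarely to trust
            regex = re.compile(
                entity + r"\s+" + re.escape(infix) + r"\s+" + entity
            )
            for sentence in sentences:
                for subj, obj in regex.findall(sentence):
                    facts.add((subj, pred, obj))
    return facts


# Toy corpus and background knowledge (illustrative data only).
corpus = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
    "Madrid is the capital of Spain.",
]
known = [
    ("Berlin", "capital_of", "Germany"),
    ("Paris", "capital_of", "France"),
]

patterns = extract_patterns(corpus, known)
# Facts proposed by the patterns that were not already known.
new_facts = apply_patterns(corpus, patterns) - set(known)
```

In this sketch, the two known facts yield one shared infix, which is then matched against the third sentence to propose a fact absent from the background knowledge. Feeding `new_facts` back into `known` and repeating would correspond to a further bootstrapping iteration.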

Citation (APA)

Gerber, D., & Ngomo, A. C. N. (2012). Extracting multilingual natural-language patterns for RDF predicates. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7603 LNAI, pp. 87–96). https://doi.org/10.1007/978-3-642-33876-2_10
