Low-cost supervision for multiple-source attribute extraction

Joseph Reisinger; Marius Paşca

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5449 LNCS, pp. 382–393 (2009)

DOI: 10.1007/978-3-642-00382-0_31

Abstract

Previous studies on extracting class attributes from unstructured text consider either Web documents or query logs as the source of textual data. Web search queries have been shown to yield attributes of higher quality. However, since many relevant attributes found in Web documents occur infrequently in query logs, Web documents remain an important source for extraction. In this paper, we introduce Bootstrapped Web Search (BWS) extraction, the first approach to extracting class attributes simultaneously from both sources. Extraction is guided by a small set of seed attributes and does not rely on further domain-specific knowledge. BWS is shown to improve both extraction precision and attribute relevance across 40 test classes. © Springer-Verlag Berlin Heidelberg 2009.

Cite

Reisinger, J., & Paşca, M. (2009). Low-cost supervision for multiple-source attribute extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5449 LNCS, pp. 382–393). https://doi.org/10.1007/978-3-642-00382-0_31
