Low-cost supervision for multiple-source attribute extraction

Abstract

Previous studies on extracting class attributes from unstructured text consider either Web documents or query logs as the source of textual data. Web search queries have been shown to yield attributes of higher quality. However, since many relevant attributes found in Web documents occur infrequently in query logs, Web documents remain an important source for extraction. In this paper, we introduce Bootstrapped Web Search (BWS) extraction, the first approach to extracting class attributes simultaneously from both sources. Extraction is guided by a small set of seed attributes and does not rely on further domain-specific knowledge. BWS is shown to improve both extraction precision and attribute relevance across 40 test classes. © Springer-Verlag Berlin Heidelberg 2009.

Cite

Reisinger, J., & Paşca, M. (2009). Low-cost supervision for multiple-source attribute extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5449 LNCS, pp. 382–393). https://doi.org/10.1007/978-3-642-00382-0_31
