Previous studies on extracting class attributes from unstructured text consider either Web documents or query logs as the source of textual data. Web search queries have been shown to yield attributes of higher quality. However, since many relevant attributes found in Web documents occur infrequently in query logs, Web documents remain an important source for extraction. In this paper, we introduce Bootstrapped Web Search (BWS) extraction, the first approach to extracting class attributes simultaneously from both sources. Extraction is guided by a small set of seed attributes and does not rely on further domain-specific knowledge. BWS is shown to improve both extraction precision and attribute relevance across 40 test classes. © Springer-Verlag Berlin Heidelberg 2009.
Reisinger, J., & Paşca, M. (2009). Low-cost supervision for multiple-source attribute extraction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5449 LNCS, pp. 382–393). https://doi.org/10.1007/978-3-642-00382-0_31