Using Weak Supervision to Identify Long-Tail Entities for Knowledge Base Completion

Yaser Oulabi; Christian Bizer

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2019), Vol. 11702 LNCS, pp. 83–98

DOI: 10.1007/978-3-030-33220-4_7

Abstract

Data from relational web tables can be used to augment cross-domain knowledge bases like DBpedia, Wikidata, or the Google Knowledge Graph with descriptions of entities that are not yet part of the knowledge base. Such long-tail entities include, for instance, small villages, niche songs, or athletes who play in lower-level leagues. In previous work, we presented an approach that successfully assembles descriptions of long-tail entities from relational HTML tables using supervised matching methods and manually labeled training data in the form of positive and negative entity matches. Manually labeling training data is laborious, however, because knowledge bases cover many different classes. In this work, we investigate reducing the labeling effort for the task of long-tail entity extraction by using weak supervision. We present a bootstrapping approach that requires domain experts to provide a small set of simple, class-specific matching rules, instead of requiring them to label a large set of entity matches, thereby reducing the human supervision effort considerably. We evaluate this weak supervision approach and find that it performs only slightly worse than methods that rely on large sets of manually labeled entity matches.
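The idea of replacing hand-labeled matches with a few class-specific rules can be illustrated with a minimal sketch. It is not the authors' implementation: the rule `song_rule`, the field names `title` and `artist`, and the thresholds 0.9 and 0.5 are all assumptions chosen for illustration. A rule labels a candidate pair of table row and entity as a match or a non-match, or abstains, and the labeled pairs form a weak training set.

```python
from difflib import SequenceMatcher

# Labels a rule can assign; ABSTAIN means the rule makes no decision.
POSITIVE, NEGATIVE, ABSTAIN = 1, 0, None


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def song_rule(row, entity):
    """Hypothetical matching rule for a 'Song' class (assumed thresholds)."""
    sim = name_similarity(row["title"], entity["title"])
    same_artist = row["artist"].lower() == entity["artist"].lower()
    if sim >= 0.9 and same_artist:
        return POSITIVE   # near-identical title and same artist
    if sim < 0.5 or not same_artist:
        return NEGATIVE   # clearly different title or different artist
    return ABSTAIN        # uncertain pair: leave it unlabeled


def weak_labels(pairs, rule):
    """Apply a rule to candidate pairs and keep only the labeled ones."""
    labeled = []
    for row, entity in pairs:
        label = rule(row, entity)
        if label is not ABSTAIN:
            labeled.append((row, entity, label))
    return labeled


# Illustrative candidate pairs (made-up data, not from the paper).
pairs = [
    ({"title": "Blue Moon", "artist": "Elvis Presley"},
     {"title": "Blue Moon", "artist": "Elvis Presley"}),
    ({"title": "Blue Moon", "artist": "Elvis Presley"},
     {"title": "Yesterday", "artist": "The Beatles"}),
    ({"title": "Blue Moon (Live)", "artist": "Elvis Presley"},
     {"title": "Blue Moon", "artist": "Elvis Presley"}),
]

training_set = weak_labels(pairs, song_rule)
print([label for _, _, label in training_set])
```

In a bootstrapping setup, a weak training set like this could then be used to train a supervised matcher, which in turn decides the pairs on which the rules abstained.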

Cite

Oulabi, Y., & Bizer, C. (2019). Using Weak Supervision to Identify Long-Tail Entities for Knowledge Base Completion. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11702 LNCS, pp. 83–98). Springer. https://doi.org/10.1007/978-3-030-33220-4_7
