HighLife: Higher-arity fact harvesting

Abstract

Text-based knowledge extraction methods for populating knowledge bases have focused on binary facts: relationships between two entities. However, in advanced domains such as health, it is often crucial to consider ternary and higher-arity relations. An example is to capture which drug is used for which disease at which dosage (e.g., 2.5 mg/day) for which kinds of patients (e.g., children vs. adults). In this work, we present an approach to harvest higher-arity facts from textual sources. Our method is distantly supervised by seed facts, and uses the fact-pattern duality principle to gather fact candidates with high recall. For high precision, we devise a constraint-based reasoning method to eliminate false candidates. A major novelty is in coping with the difficulty that higher-arity facts are often expressed only partially in texts and strewn across multiple sources. For example, one sentence may refer to a drug, a disease and a group of patients, whereas another sentence talks about the drug, its dosage and the target group without mentioning the disease. Our methods cope well with such partially observed facts, at both the pattern-learning and the constraint-reasoning stages. Experiments with health-related documents and with news articles demonstrate the viability of our method.
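
To make the notion of a partially observed higher-arity fact concrete, the following is a minimal illustrative sketch in Python. It is not the authors' implementation: the argument order (drug, disease, dosage, patient group), the names `PartialFact`, `compatible` and `merge`, and the placeholder entities `DrugX` and `DiseaseY` are all assumptions made for illustration. It only shows how two sentences that each mention a subset of the arguments could be represented and combined.

```python
from typing import Optional, Tuple

# Hypothetical argument order for a 4-ary relation:
# (drug, disease, dosage, patient_group).
# None marks an argument that the source sentence did not mention.
PartialFact = Tuple[Optional[str], ...]


def compatible(a: PartialFact, b: PartialFact) -> bool:
    """True if the two partial facts never disagree on an observed argument."""
    return len(a) == len(b) and all(
        x is None or y is None or x == y for x, y in zip(a, b)
    )


def merge(a: PartialFact, b: PartialFact) -> Optional[PartialFact]:
    """Combine two compatible partial facts; return None if they conflict."""
    if not compatible(a, b):
        return None
    return tuple(x if x is not None else y for x, y in zip(a, b))


# One sentence mentions drug, disease and patient group (placeholder names).
s1: PartialFact = ("DrugX", "DiseaseY", None, "children")
# Another sentence mentions drug, dosage and patient group, but no disease.
s2: PartialFact = ("DrugX", None, "2.5 mg/day", "children")

merged = merge(s1, s2)
conflict = merge(s1, ("DrugX", None, "5 mg/day", "adults"))
```

Agreement on the shared arguments alone does not guarantee that the combined fact is true, which is why the abstract describes a separate constraint-based reasoning step for eliminating false candidates; that step is not modelled in this sketch.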

Cite

Ernst, P., Siu, A., & Weikum, G. (2018). HighLife: Higher-arity fact harvesting. In The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018 (pp. 1013–1022). Association for Computing Machinery, Inc. https://doi.org/10.1145/3178876.3186000
