In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base, such as Freebase, as a source of supervision. When a sentence and a knowledge base refer to the same entity pair, this approach heuristically labels the sentence with the corresponding relation in the knowledge base. However, this heuristic can fail, with the result that some sentences are labeled wrongly. Such noisily labeled data causes poor extraction performance. In this paper, we propose a method to reduce the number of wrong labels. We present a novel generative model that directly models the heuristic labeling process of distant supervision. The model predicts whether assigned labels are correct or wrong via its hidden variables. Our experimental results show that this model detected wrong labels with higher performance than baseline methods. In the experiment, we also found that our wrong label reduction boosted the performance of relation extraction.
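
The heuristic labeling step described above can be illustrated with a minimal sketch. It covers only the labeling heuristic, not the proposed generative model. The toy knowledge base, the example sentences, the relation name `place_of_birth`, the function name `distant_label`, and the use of plain substring matching in place of real entity linking are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: (entity1, entity2) -> relation.
KB: Dict[Tuple[str, str], str] = {
    ("Barack Obama", "Honolulu"): "place_of_birth",
}


def distant_label(
    sentences: List[str], kb: Dict[Tuple[str, str], str]
) -> List[Tuple[str, str, str, str]]:
    """Label every sentence that mentions both entities of a KB pair.

    Substring matching stands in for entity recognition and linking,
    which a real system would need.
    """
    labeled = []
    for sentence in sentences:
        for (e1, e2), relation in kb.items():
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, e1, e2, relation))
    return labeled


sentences = [
    "Barack Obama was born in Honolulu.",        # expresses the relation
    "Barack Obama visited Honolulu last week.",  # does not express it
    "Honolulu is the capital of Hawaii.",        # mentions only one entity
]

labeled = distant_label(sentences, KB)
for sentence, e1, e2, relation in labeled:
    print(f"{relation}({e1}, {e2}) <- {sentence}")
```

Both of the first two sentences receive the same label, although only the first expresses the relation. The second is an instance of the wrong labels that the heuristic introduces.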