Reducing Wrong Labels in Distant Supervision for Relation Extraction

  • Takamatsu S
  • Sato I
  • Nakagawa H


In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base, such as Freebase, as a source of supervision. When a sentence and a knowledge base refer to the same entity pair, this approach heuristically labels the sentence with the corresponding relation in the knowledge base. However, this heuristic can fail with the result that some sentences are labeled wrongly. This noisy labeled data causes poor extraction performance. In this paper, we propose a method to reduce the number of wrong labels. We present a novel generative model that directly models the heuristic labeling process of distant supervision. The model predicts whether assigned labels are correct or wrong via its hidden variables. Our experimental results show that this model detected wrong labels with higher performance than baseline methods. In the experiment, we also found that our wrong label reduction boosted the performance of relation extraction.
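
The labeling heuristic described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's generative model; it only shows the heuristic step that the model is meant to correct. The toy knowledge base, the example sentences, the substring-based entity matching, and the function name `distant_label` are all assumptions made for illustration.

```python
# Hypothetical toy knowledge base: (entity1, entity2) -> relation.
# A real system would use a resource such as Freebase.
KB = {
    ("Barack Obama", "Honolulu"): "place_of_birth",
    ("Google", "Larry Page"): "founded_by",
}


def distant_label(sentences, kb):
    """Label a sentence with the KB relation of every entity pair it mentions.

    Entity matching is naive substring search (an assumption of this sketch).
    """
    labeled = []
    for sentence in sentences:
        for (e1, e2), relation in kb.items():
            if e1 in sentence and e2 in sentence:
                labeled.append((sentence, e1, e2, relation))
    return labeled


sentences = [
    "Barack Obama was born in Honolulu.",        # expresses the relation
    "Barack Obama visited Honolulu last week.",  # same pair, different meaning
    "Larry Page spoke at a conference.",         # only one entity of a pair
]

labeled = distant_label(sentences, KB)
for sentence, e1, e2, relation in labeled:
    print(f"{relation}({e1}, {e2}) <- {sentence}")
```

In this sketch the second sentence receives the label `place_of_birth` even though it does not express that relation, which is the kind of wrong label the abstract refers to.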

Find this document

  • SCOPUS: 2-s2.0-84878213872
  • SGR: 84878213872
  • ISBN: 9781937284244
  • PUI: 368987849


  • Shingo Takamatsu

  • Issei Sato

  • Hiroshi Nakagawa
