Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models

Junpeng Li; Zixia Jia; Zilong Zheng

Conference Proceedings

EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (2023) 5495-5505

DOI: 10.18653/v1/2023.emnlp-main.334

Abstract

Document-level Relation Extraction (DocRE), which aims to extract relations from a long context, is a critical challenge in achieving fine-grained structural comprehension and generating interpretable document representations. Inspired by recent advances in the in-context learning capabilities that emerge in large language models (LLMs) such as ChatGPT, we aim to design an automated annotation method for DocRE with minimal human effort. Unfortunately, vanilla in-context learning is infeasible for document-level relation extraction (RE) because of the large number of predefined fine-grained relation types and the uncontrolled generations of LLMs. To tackle this issue, we propose a method that integrates an LLM and a natural language inference (NLI) module to generate relation triples, thereby augmenting document-level relation datasets. We demonstrate the effectiveness of our approach by introducing an enhanced dataset, DocGNRE, which excels in re-annotating numerous long-tail relation types. We are confident that our method holds potential for broader application to domain-specific relation type definitions and offers tangible benefits in advancing generalized language semantic comprehension.

Cite

Li, J., Jia, Z., & Zheng, Z. (2023). Semi-automatic Data Enhancement for Document-Level Relation Extraction with Distant Supervision from Large Language Models. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 5495–5505). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.334
