IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models

4Citations
Citations of this article
34Readers
Mendeley users who have this article in their library.
Get full text

Abstract

We introduce a new open information extraction (OIE) benchmark for pre-trained language models (LM). Recent studies have demonstrated that pre-trained LMs, such as BERT and GPT, may store linguistic and relational knowledge. In particular, LMs are able to answer “fill-in-the-blank” questions when given a pre-defined relation category. Instead of focusing on pre-defined relations, we create an OIE benchmark aiming to fully examine the open relational information present in the pre-trained LMs. We accomplish this by turning pre-trained LMs into zero-shot OIE systems. Surprisingly, pre-trained LMs are able to obtain competitive performance on both standard OIE datasets (CaRB and Re-OIE2016) and two new large-scale factual OIE datasets (TAC KBP-OIE and Wikidata-OIE) that we establish via distant supervision. For instance, the zero-shot pre-trained LMs outperform the F1 score of the state-of-the-art supervised OIE methods on our factual OIE datasets without needing to use any training sets.

Cite

CITATION STYLE

APA

Wang, C., Liu, X., & Song, D. (2022). IELM: An Open Information Extraction Benchmark for Pre-Trained Language Models. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 8417–8437). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.576

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free