Hallucination Mitigation in Natural Language Generation from Large-Scale Open-Domain Knowledge Graphs

Abstract

In generating natural language descriptions for knowledge graph triples, prior works used either small-scale, human-annotated datasets or datasets with a limited variety of graph shapes, e.g., datasets consisting mostly of star graphs. Graph-to-text models trained and evaluated on such datasets are largely not assessed in more realistic large-scale, open-domain settings. We introduce a new dataset, GraphNarrative, to fill this gap. Fine-tuning transformer-based pretrained language models has achieved state-of-the-art performance among graph-to-text models. However, this method suffers from information hallucination: the generated text may contain fabricated facts not present in the input graphs. We propose a novel approach that, given a graph-sentence pair in GraphNarrative, trims the sentence to eliminate portions that are not present in the corresponding graph, using the sentence's dependency parse tree. Our experimental results verify this approach using models trained on GraphNarrative and existing datasets. The dataset, source code, and trained models are released at https://github.com/idirlab/graphnarrator.
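The sketch below illustrates the general idea of dependency-tree-based sentence trimming described in the abstract: prune modifier subtrees whose tokens never appear among the input graph's entity mentions. It is a hypothetical approximation, not the authors' released implementation (see the repository linked above); it assumes spaCy with the en_core_web_sm model, and the names PRUNABLE_DEPS and trim_sentence are illustrative choices made here.

```python
# Hypothetical sketch of dependency-based sentence trimming, not the
# authors' exact algorithm. Requires: pip install spacy, then
# python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

# Dependency relations that typically attach optional modifiers; a subtree
# hanging off one of these can often be dropped without breaking the clause.
PRUNABLE_DEPS = {"appos", "acl", "relcl", "advcl", "prep"}

def trim_sentence(sentence: str, graph_mentions: set[str]) -> str:
    """Drop modifier subtrees that mention nothing from the input graph.

    graph_mentions: surface forms of entities/relations in the triples.
    """
    doc = nlp(sentence)
    mention_tokens = {w.lower() for m in graph_mentions for w in m.split()}
    drop: set[int] = set()
    for token in doc:
        if token.dep_ in PRUNABLE_DEPS and token.i not in drop:
            subtree = list(token.subtree)
            # Keep the subtree if any of its tokens occurs in a graph mention.
            if not any(t.text.lower() in mention_tokens for t in subtree):
                drop.update(t.i for t in subtree)
    return "".join(t.text_with_ws for t in doc if t.i not in drop).strip()

# The modifier "near the river Thames" is absent from the graph, so it is
# pruned; expected output (parse-dependent): "Alan Turing was born in London."
print(trim_sentence(
    "Alan Turing was born in London near the river Thames.",
    {"Alan Turing", "born in", "London"},
))
```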

Citation (APA)

Shi, X., Zhu, Z., Zhang, Z., & Li, C. (2023). Hallucination Mitigation in Natural Language Generation from Large-Scale Open-Domain Knowledge Graphs. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (pp. 12506–12521). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.770
