Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization

Prasetya Ajie Utama; Joshua Bambrick; Nafise Sadat Moosavi; Iryna Gurevych

Conference Proceedings, Open Access

NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (2022) 2763-2776

DOI: 10.18653/v1/2022.naacl-main.199

Abstract

Neural abstractive summarization models are prone to generating summaries that are factually inconsistent with their source documents. Previous work has introduced the task of recognizing such factual inconsistency as a downstream application of natural language inference (NLI). However, state-of-the-art NLI models perform poorly in this context because they fail to generalize to the target task. In this work, we show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples. We introduce Falsesum, a data generation pipeline that leverages a controllable text generation model to perturb human-annotated summaries, introducing varying types of factual inconsistencies. Unlike previously introduced document-level NLI datasets, our generated dataset contains examples that are diverse and inconsistent yet plausible. We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization.
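The framing described in the abstract, where the source document acts as the NLI premise and a candidate summary as the hypothesis, can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the `NLIExample` container, the `build_examples` helper, the label strings, and especially `toy_perturb`, which is a trivial word-swap stand-in and not the controllable text generation model the paper uses.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class NLIExample:
    premise: str     # the full source document
    hypothesis: str  # a candidate summary of that document
    label: str       # "entailment" or "non-entailment" (assumed label names)


# Hypothetical word swaps; a toy stand-in for a learned perturbation model.
SWAPS = {"rose": "fell", "won": "lost", "increased": "decreased"}


def toy_perturb(summary: str) -> str:
    """Flip selected words to make a summary inconsistent (illustration only)."""
    return " ".join(SWAPS.get(word, word) for word in summary.split())


def build_examples(
    pairs: List[Tuple[str, str]],
    perturb: Callable[[str], str],
) -> List[NLIExample]:
    """Turn (document, reference summary) pairs into document-level NLI examples."""
    examples: List[NLIExample] = []
    for document, summary in pairs:
        # The human-written summary is treated as consistent with the document.
        examples.append(NLIExample(document, summary, "entailment"))
        perturbed = perturb(summary)
        # Keep a negative example only if the perturbation changed the text.
        if perturbed != summary:
            examples.append(NLIExample(document, perturbed, "non-entailment"))
    return examples


pairs = [
    (
        "Shares of Acme rose 5% on Monday after strong earnings.",
        "Acme shares rose on Monday.",
    )
]
examples = build_examples(pairs, toy_perturb)
for example in examples:
    print(example.label, "|", example.hypothesis)
```

In this sketch the resulting examples would be added to an existing NLI training set; the quality of the negatives depends entirely on the perturbation function, which is the part the paper's pipeline addresses.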

Citation (APA)

Utama, P. A., Bambrick, J., Moosavi, N. S., & Gurevych, I. (2022). Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization. In NAACL 2022 - 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference (pp. 2763–2776). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.naacl-main.199
