FFCI: A Framework for Interpretable Automatic Evaluation of Summarization


Abstract

In this paper, we propose FFCI, a framework for fine-grained summarization evaluation that comprises four elements: faithfulness (degree of factual consistency with the source), focus (precision of summary content relative to the reference), coverage (recall of summary content relative to the reference), and inter-sentential coherence (document fluency between adjacent sentences). We construct a novel dataset for focus, coverage, and inter-sentential coherence, and develop automatic methods for evaluating each of the four dimensions of FFCI based on cross-comparison of evaluation metrics and model-based evaluation methods, including question answering (QA) approaches, semantic textual similarity (STS), next-sentence prediction (NSP), and scores derived from 19 pre-trained language models. We then apply the developed metrics in evaluating a broad range of summarization models across two datasets, with some surprising findings.

Citation (APA)

Koto, F., Baldwin, T., & Lau, J. H. (2022). FFCI: A Framework for Interpretable Automatic Evaluation of Summarization. Journal of Artificial Intelligence Research, 73, 1553–1607. https://doi.org/10.1613/jair.1.13167
