Length Does Matter: Summary Length can Bias Summarization Metrics

4 citations · 9 Mendeley readers

Abstract

Establishing the characteristics of an effective summary is a complicated and often subjective endeavor. Consequently, the development of metrics for the summarization task has become a dynamic area of research within natural language processing. In this paper, we reveal that existing summarization metrics exhibit a bias toward the length of generated summaries. Our thorough experiments, conducted on a variety of datasets, metrics, and models, substantiate these findings. The results indicate that most metrics tend to favor longer summaries, even after accounting for other factors. To address this issue, we introduce a Bayesian normalization technique that effectively diminishes this bias. We demonstrate that our approach significantly improves the concordance between human annotators and the majority of metrics in terms of summary coherence.
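The abstract mentions a Bayesian normalization of metric scores with respect to summary length but does not spell out the formulation; the paper itself should be consulted for the actual method. As a purely illustrative sketch of the general idea of length-conditioned normalization, the function below (all names, the binning scheme, and the shrinkage prior are assumptions, not the authors' implementation) re-centers each metric score by an empirical-Bayes estimate of the expected score at that summary length:

```python
import numpy as np

def length_normalized_scores(lengths, scores, n_bins=5, prior_strength=10.0):
    """Remove the length-correlated component of summarization metric scores.

    Hypothetical sketch: summaries are grouped into equal-width length bins,
    each bin's mean score is shrunk toward the global mean under a simple
    conjugate-normal prior (an empirical-Bayes flavor of normalization), and
    each raw score is re-centered by its bin's posterior mean.
    """
    lengths = np.asarray(lengths, dtype=float)
    scores = np.asarray(scores, dtype=float)
    global_mean = scores.mean()

    # Assign each summary to a length bin (equal-width bins over the range).
    edges = np.linspace(lengths.min(), lengths.max(), n_bins + 1)
    bins = np.clip(np.digitize(lengths, edges[1:-1]), 0, n_bins - 1)

    adjusted = scores.copy()
    for b in range(n_bins):
        mask = bins == b
        n = mask.sum()
        if n == 0:
            continue
        # Posterior mean of the bin's score under a prior centered at the
        # global mean with pseudo-count `prior_strength`.
        post_mean = (prior_strength * global_mean + scores[mask].sum()) / (
            prior_strength + n
        )
        # Subtract the length-conditional expectation, keep the global level.
        adjusted[mask] = scores[mask] - post_mean + global_mean
    return adjusted
```

On synthetic data where scores grow linearly with length, the adjusted scores correlate less with length than the raw ones, which is the intended effect of such a normalization.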

Citation (APA)

Guo, X., & Vosoughi, S. (2023). Length Does Matter: Summary Length can Bias Summarization Metrics. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 15869–15879). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.984
