Length Does Matter: Summary Length can Bias Summarization Metrics

4 citations · 9 Mendeley readers

Abstract

Establishing the characteristics of an effective summary is a complicated and often subjective endeavor. Consequently, the development of metrics for the summarization task has become a dynamic area of research within natural language processing. In this paper, we reveal that existing summarization metrics exhibit a bias toward the length of generated summaries. Our thorough experiments, conducted on a variety of datasets, metrics, and models, substantiate these findings. The results indicate that most metrics tend to favor longer summaries, even after accounting for other factors. To address this issue, we introduce a Bayesian normalization technique that effectively diminishes this bias. We demonstrate that our approach significantly improves the concordance between human annotators and the majority of metrics in terms of summary coherence.
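The abstract mentions a Bayesian normalization of metric scores with respect to summary length but does not spell out the formulation; the paper itself should be consulted for the actual method. As a purely illustrative sketch of the general idea of length-conditioned normalization, the function below (all names, the binning scheme, and the shrinkage prior are assumptions, not the authors' implementation) re-centers each metric score by an empirical-Bayes estimate of the expected score at that summary length:

```python
import numpy as np

def length_normalized_scores(lengths, scores, n_bins=5, prior_strength=10.0):
    """Remove the length-correlated component of summarization metric scores.

    Hypothetical sketch: summaries are grouped into equal-width length bins,
    each bin's mean score is shrunk toward the global mean under a simple
    conjugate-normal prior (an empirical-Bayes flavor of normalization), and
    each raw score is re-centered by its bin's posterior mean.
    """
    lengths = np.asarray(lengths, dtype=float)
    scores = np.asarray(scores, dtype=float)
    global_mean = scores.mean()

    # Assign each summary to a length bin (equal-width bins over the range).
    edges = np.linspace(lengths.min(), lengths.max(), n_bins + 1)
    bins = np.clip(np.digitize(lengths, edges[1:-1]), 0, n_bins - 1)

    adjusted = scores.copy()
    for b in range(n_bins):
        mask = bins == b
        n = mask.sum()
        if n == 0:
            continue
        # Posterior mean of the bin's score under a prior centered at the
        # global mean with pseudo-count `prior_strength`.
        post_mean = (prior_strength * global_mean + scores[mask].sum()) / (
            prior_strength + n
        )
        # Subtract the length-conditional expectation, keep the global level.
        adjusted[mask] = scores[mask] - post_mean + global_mean
    return adjusted
```

On synthetic data where scores grow linearly with length, the adjusted scores correlate less with length than the raw ones, which is the intended effect of such a normalization.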

Citation (APA)

Guo, X., & Vosoughi, S. (2023). Length Does Matter: Summary Length can Bias Summarization Metrics. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 15869–15879). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.984
