Cross-lingual summarization aims to help people efficiently grasp the core idea of a document written in a foreign language. Modern text summarization models generate highly fluent but often factually inconsistent outputs, a problem that has received heightened attention in recent research. However, the factual consistency of cross-lingual summarization has not yet been investigated. In this paper, we propose a cross-lingual factuality dataset by collecting human annotations of reference summaries as well as model-generated summaries, at both the summary level and the sentence level. Furthermore, we perform a fine-grained analysis and observe that over 50% of generated summaries and over 27% of reference summaries contain factual errors, with characteristics different from those in monolingual summarization. Existing evaluation metrics for monolingual summarization require translation to assess the factuality of cross-lingual summaries, and they perform differently across tasks and levels. Finally, we adapt the monolingual factuality metrics as an initial step towards the automatic evaluation of summarization factuality in cross-lingual settings. Our dataset and code are available at https://github.com/kite99520/Fact_CLS.
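The abstract's translate-then-evaluate idea can be illustrated with a minimal sketch: translate the cross-lingual summary back into the source language, then apply a monolingual factuality proxy. The sketch below assumes off-the-shelf HuggingFace models; the model names, the Chinese-to-English direction, and the NLI entailment score used as a factuality proxy are illustrative choices, not the paper's exact setup.

```python
# Minimal sketch: evaluate a cross-lingual summary by translating it back
# into the source language, then scoring it with a monolingual proxy.
# Assumptions: the metric choice (NLI entailment) and both model names are
# illustrative, not the metrics adapted in the paper.
from transformers import pipeline

# Hypothetical example: an English source document with a Chinese summary.
source_doc = "The company reported record profits in 2022 after expanding overseas."
zh_summary = "该公司在海外扩张后，2022年利润创下纪录。"

# Step 1: translate the cross-lingual summary back into the source language.
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-zh-en")
en_summary = translator(zh_summary)[0]["translation_text"]

# Step 2: apply a monolingual factuality proxy — here, the entailment
# probability of the translated summary (hypothesis) given the source
# document (premise), from an NLI model.
nli = pipeline("text-classification", model="facebook/bart-large-mnli")
scores = nli({"text": source_doc, "text_pair": en_summary}, top_k=None)
entailment = next(s["score"] for s in scores if s["label"] == "entailment")
print(f"Entailment-based factuality proxy: {entailment:.3f}")
```

Note that the translation step itself can introduce errors, which is one reason such adapted metrics may behave differently across tasks and levels, as the abstract observes.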
Citation:
Gao, M., Wang, W., Wan, X., & Xu, Y. (2023). Evaluating Factuality in Cross-lingual Summarization. In Findings of the Association for Computational Linguistics: ACL 2023 (pp. 12415–12431). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.findings-acl.786