Abstract
This research addresses the challenges of Cross-Lingual Summarization (CLS) in low-resource scenarios and over imbalanced multilingual data. Existing CLS studies mostly resort to pipeline frameworks or multi-task methods in bilingual settings; however, they ignore the data imbalance in multilingual scenarios and do not exploit high-resource monolingual summarization data. In this paper, we propose the Aligned CROSs-lingual Summarization (ACROSS) model to tackle these issues. Our framework aligns low-resource cross-lingual data with high-resource monolingual data via contrastive and consistency losses, which helps enrich low-resource representations for high-quality summaries. In addition, we introduce a data augmentation method that selects informative monolingual sentences, facilitating a deep exploration of high-resource information and introducing new information for low-resource languages. Experiments on the CrossSum dataset show that ACROSS outperforms baseline models and achieves consistently superior performance on 45 language pairs.
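The abstract mentions aligning cross-lingual and monolingual representations with contrastive and consistency losses but does not spell out their form. As a rough, hedged sketch only (the paper's exact objectives are defined in the full text): assuming an InfoNCE-style contrastive loss over paired batch representations and a KL-divergence consistency term between output distributions, which are common choices for such alignment, the two losses might look like:

```python
import numpy as np

def info_nce_loss(cross_emb, mono_emb, temperature=0.1):
    """Contrastive (InfoNCE-style) loss: pull each cross-lingual
    representation toward its paired monolingual representation
    and push it away from the other pairs in the batch.
    NOTE: illustrative sketch, not the paper's exact formulation."""
    # L2-normalize so dot products are cosine similarities.
    c = cross_emb / np.linalg.norm(cross_emb, axis=1, keepdims=True)
    m = mono_emb / np.linalg.norm(mono_emb, axis=1, keepdims=True)
    logits = c @ m.T / temperature                  # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Matching cross/mono pairs sit on the diagonal.
    return -np.mean(np.diag(log_probs))

def consistency_loss(cross_probs, mono_probs, eps=1e-9):
    """KL divergence encouraging the cross-lingual output distribution
    to stay consistent with the monolingual one (illustrative)."""
    return np.mean(np.sum(
        mono_probs * (np.log(mono_probs + eps) - np.log(cross_probs + eps)),
        axis=1))
```

Intuitively, the contrastive term is near zero when matched pairs are far more similar than mismatched ones, and the consistency term vanishes when the two output distributions coincide.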
Citation
Li, P., Zhang, Z., Wang, J., Li, L., Jatowt, A., & Yang, Z. (2023). ACROSS: An Alignment-based Framework for Low-Resource Many-to-One Cross-Lingual Summarization. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 2458–2472). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.154