CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models

Abstract

In this paper, we consider the challenge of summarizing patients' medical progress notes in a limited data setting. For the Problem List Summarization task (shared task 1A) at the BioNLP Workshop 2023, we demonstrate that Clinical-T5 fine-tuned on 765 medical clinic notes outperforms extractive, abstractive and zero-shot baselines, yielding reasonable baseline systems for medical note summarization. Further, we introduce the Hierarchical Ensemble of Summarization Models (HESM), which consists of token-level ensembles of diverse fine-tuned Clinical-T5 models followed by Minimum Bayes Risk (MBR) decoding. Our HESM approach leads to a considerable boost in summarization performance, and when evaluated on held-out challenge data it achieved a ROUGE-L of 32.77, placing our system at the top of the shared task leaderboard.
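
To make the two HESM stages described above concrete, the sketch below illustrates token-level ensembling (averaging the next-token distributions of several fine-tuned models at each decoding step) and Minimum Bayes Risk selection over candidate summaries. This is a minimal illustration under stated assumptions, not the authors' released implementation; the function names and the word-overlap stand-in for ROUGE-L are assumptions made for the example.

import numpy as np

def ensemble_next_token_probs(per_model_probs):
    """Token-level ensemble: average the next-token distributions produced
    by the diverse fine-tuned models at one decoding step."""
    return np.mean(np.stack(per_model_probs, axis=0), axis=0)

def mbr_select(candidates, similarity):
    """Minimum Bayes Risk selection: return the candidate summary whose
    average similarity to the other candidates is highest."""
    best, best_score = None, -1.0
    for i, cand in enumerate(candidates):
        score = np.mean([similarity(cand, other)
                         for j, other in enumerate(candidates) if j != i])
        if score > best_score:
            best, best_score = cand, score
    return best

if __name__ == "__main__":
    # Toy stand-in for ROUGE-L: Jaccard overlap of word sets.
    def overlap(a, b):
        wa, wb = set(a.split()), set(b.split())
        return len(wa & wb) / max(len(wa | wb), 1)

    # Averaging two hypothetical next-token distributions from two models.
    print(ensemble_next_token_probs([np.array([0.7, 0.3]),
                                     np.array([0.5, 0.5])]))

    # MBR selection over candidate problem-list summaries.
    candidates = ["acute kidney injury; hypertension",
                  "acute kidney injury; anemia",
                  "hypertension; chest pain"]
    print(mbr_select(candidates, overlap))

In the paper's setting, the averaged distributions would drive decoding across the ensemble of Clinical-T5 models, and a ROUGE-L scorer would replace the toy overlap metric used here for MBR selection.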

Citation (APA)

Manakul, P., Fathullah, Y., Liusie, A., Raina, V., Raina, V., & Gales, M. (2023). CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 516–523). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.bionlp-1.51
