MNLP at MEDIQA 2021: Fine-Tuning PEGASUS for Consumer Health Question Summarization


Abstract

This paper details a Consumer Health Question (CHQ) summarization model submitted to MEDIQA 2021 shared task 1: Question Summarization. Many CHQs consist of multiple sentences containing typos or unnecessary information, which can interfere with automated question answering systems. Question summarization mitigates this issue by removing the unnecessary information, aiding automated systems in generating a more accurate summary. Our approach focuses on applying multiple pre-processing techniques, including question focus identification on the input, and on developing an ensemble method that combines question focus with an abstractive summarization method. We use the state-of-the-art abstractive summarization model PEGASUS (Pre-training with Extracted Gap-sentences for Abstractive Summarization) to generate abstractive summaries. Our experiments show that our ensemble method, which combines abstractive summarization with question focus identification, improves performance over using summarization alone. Our model achieves a ROUGE-2 F-measure of 11.14% on the official test dataset.
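For context on the reported metric, the sketch below computes a simplified ROUGE-2 F-measure (bigram overlap between a candidate summary and a reference). This is an illustrative approximation only: it assumes whitespace tokenization and lowercasing, whereas official evaluations typically use dedicated ROUGE tooling with stemming and per-example averaging. The example question and summaries are invented for illustration.

```python
from collections import Counter


def rouge2_f(reference: str, candidate: str) -> float:
    """Simplified ROUGE-2 F-measure: bigram overlap between candidate
    and reference, with whitespace tokenization and lowercasing.
    """
    def bigrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(zip(tokens, tokens[1:]))

    ref, cand = bigrams(reference), bigrams(candidate)
    overlap = sum((ref & cand).values())  # clipped bigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference summary and a candidate system output
reference = "what are the treatments for migraine headaches"
candidate = "what treatments exist for migraine headaches"
print(round(rouge2_f(reference, candidate), 4))  # prints 0.3636
```

Two of the candidate's five bigrams ("for migraine", "migraine headaches") match the reference's six, giving precision 0.4, recall 0.333, and F-measure 0.364; scores like the paper's 11.14% are averages of this quantity over a test set.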

Citation (APA)

Dang, H. N., Lee, J., Henry, S., & Uzuner, Ö. (2021). MNLP at MEDIQA 2021: Fine-Tuning PEGASUS for Consumer Health Question Summarization. In Proceedings of the 20th Workshop on Biomedical Language Processing, BioNLP 2021 (pp. 320–327). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.bionlp-1.37
