Large language models (LLMs) such as ChatGPT can be expensive to train, deploy, and use for specific natural language generation tasks such as text summarization, and for certain domains. A promising alternative is to fine-tune relatively smaller language models (LMs) on a particular task using high-quality, in-domain datasets. However, obtaining such high-quality training data can be prohibitively expensive. This issue has been mitigated by generating weakly supervised data via knowledge distillation (KD) of LLMs. We propose a three-step approach to distill ChatGPT and fine-tune smaller LMs for summarizing forum conversations. More specifically, we design a method to selectively sample a large unannotated corpus of forum conversations using a semantic similarity metric. Then, we use the same metric to retrieve suitable prompts for ChatGPT from a small annotated validation set in the same domain. Finally, the generated dataset is filtered to remove low-quality instances. Given the same amount of training data, our proposed select-prompt-filter KD approach improves over a standard KD approach by up to 6.6 ROUGE-2 points by leveraging sufficient in-domain pseudo-labelled data.
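Reading the abstract's three steps literally, the sketch below illustrates one plausible realization of the select-prompt-filter pipeline. It is not the paper's implementation: the embedding model, the one-shot prompt template, the filtering threshold, and the `generate_summary` callable (standing in for a call to the teacher LLM, e.g. ChatGPT) are all assumptions made here for illustration, since the abstract does not specify the exact similarity metric, prompt construction, or filtering criterion.

```python
# Hedged sketch of a select-prompt-filter distillation pipeline.
# Assumptions: the semantic similarity metric is cosine similarity over
# sentence-transformer embeddings; prompts use the single most similar
# annotated validation example; filtering drops pseudo-labels whose summary
# is not semantically close to its source conversation.

from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model


def select(unlabeled_convs, validation_convs, k):
    """Step 1 (select): keep the k unannotated conversations most similar to
    the in-domain validation set (one reading of 'selectively sample')."""
    unlabeled_emb = embedder.encode(unlabeled_convs, convert_to_tensor=True)
    valid_emb = embedder.encode(validation_convs, convert_to_tensor=True)
    # Score each unlabeled conversation by its best match in the validation set.
    scores = util.cos_sim(unlabeled_emb, valid_emb).max(dim=1).values
    top = scores.topk(min(k, len(unlabeled_convs))).indices.tolist()
    return [unlabeled_convs[i] for i in top]


def build_prompt(conv, validation_pairs):
    """Step 2 (prompt): retrieve the most similar annotated (conversation,
    summary) pair and use it as an in-context example for the teacher LLM."""
    conv_emb = embedder.encode(conv, convert_to_tensor=True)
    val_emb = embedder.encode([c for c, _ in validation_pairs], convert_to_tensor=True)
    best = int(util.cos_sim(conv_emb, val_emb).argmax())
    ex_conv, ex_summary = validation_pairs[best]
    return (
        "Summarize the forum conversation.\n\n"
        f"Example conversation:\n{ex_conv}\nExample summary:\n{ex_summary}\n\n"
        f"Conversation:\n{conv}\nSummary:"
    )


def filter_pairs(pairs, threshold=0.4):
    """Step 3 (filter): drop low-quality instances; here, summaries that are
    not semantically close to their conversation (assumed heuristic)."""
    kept = []
    for conv, summary in pairs:
        sim = float(util.cos_sim(embedder.encode(conv, convert_to_tensor=True),
                                 embedder.encode(summary, convert_to_tensor=True)))
        if sim >= threshold:
            kept.append((conv, summary))
    return kept


def distill(unlabeled_convs, validation_pairs, generate_summary, k=1000):
    """Full pipeline: the resulting pairs would fine-tune a smaller summarization LM."""
    selected = select(unlabeled_convs, [c for c, _ in validation_pairs], k)
    pseudo = [(c, generate_summary(build_prompt(c, validation_pairs))) for c in selected]
    return filter_pairs(pseudo)
```

The key design point the abstract emphasizes is that selection and prompt retrieval share the same similarity metric, so both the sampled conversations and the in-context examples stay close to the target domain before the filtering step prunes weak pseudo-labels.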
Citation
Pham, M. Q., Indurthi, S. R., Chollampatt, S., & Turchi, M. (2023). Select, Prompt, Filter: Distilling Large Language Models for Summarizing Conversations. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 12257–12265). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.753