Make the Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback

Abstract

For summarization, human preferences are critical for steering the summarizer's outputs toward human interests, as ground-truth summaries are scarce and ambiguous. Practical settings require dynamic exchanges between humans and AI agents in which feedback is provided online, a few at a time. In this paper, we introduce a new framework to train summarization models interactively with preference feedback. By properly leveraging offline data and a novel reward model, we improve performance in terms of ROUGE scores and sample efficiency. Our experiments on three datasets confirm the benefit of the proposed framework in active, few-shot, and online settings of preference learning.
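
To make the idea of learning from preference feedback concrete, here is a minimal sketch of an online pairwise reward update. It is not the paper's reward model, which the abstract does not specify. It assumes a linear reward over hypothetical summary features (here: coverage and redundancy) and a standard Bradley-Terry logistic loss, and all function names are illustrative.

```python
import math


def reward(weights, features):
    """Linear reward (an assumption): dot product of weights and features."""
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def preference_loss(weights, preferred, rejected):
    """Bradley-Terry negative log-likelihood that `preferred` beats `rejected`."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    # -log(sigmoid(margin)), written in a stable form for either sign
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def update(weights, preferred, rejected, lr=0.5):
    """One gradient step on a single preference pair."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    # d(loss)/d(margin) = -sigmoid(-margin)
    grad_scale = -sigmoid(-margin)
    return [w - lr * grad_scale * (p - r)
            for w, p, r in zip(weights, preferred, rejected)]


def train_online(pairs, dim, lr=0.5):
    """Consume preference pairs one at a time, as in an online feedback loop."""
    weights = [0.0] * dim
    for preferred, rejected in pairs:
        weights = update(weights, preferred, rejected, lr)
    return weights


# Hypothetical feedback: each pair is (features of preferred summary,
# features of rejected summary), with features [coverage, redundancy].
feedback = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.2], [0.3, 0.6]),
    ([0.8, 0.3], [0.4, 0.5]),
]
learned = train_online(feedback, dim=2)
```

With this toy feedback the learned weights favor coverage and penalize redundancy, so the reward ranks each preferred summary above its rejected counterpart. A real system would replace the hand-made features with a learned text encoder.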

Cite

Nguyen, D. H., Nghiem, N. V. D., Nguyen, B. S., Le, D. T., Sabahi, S., Nguyen, M. T., & Le, H. (2022). Make the Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback. In Findings of the Association for Computational Linguistics: NAACL 2022 (pp. 1919–1930). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-naacl.147
