Language model pre-training has led to state-of-the-art performance in text summarization. While a variety of pre-trained transformer models are available nowadays, they are mostly trained on documents. In this study, we introduce self-supervised pre-training to enhance the BERT model's semantic and structural understanding of dialog texts from social media. We also propose a semi-supervised teacher-student learning framework to address the common issue of limited labels in summarization datasets. We empirically evaluate our approach on the extractive summarization task with the TWEETSUMM corpus, a recently introduced dialog summarization dataset of Twitter customer care conversations, and demonstrate that both our self-supervised pre-training and our semi-supervised teacher-student learning are beneficial compared with other pre-trained models. Additionally, we compare pre-training and teacher-student learning in various low-resource settings and find that pre-training outperforms teacher-student learning, with the gap between the two widening as labeled data becomes scarcer.
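To make the teacher-student idea concrete, here is a minimal sketch of semi-supervised pseudo-labeling for extractive summarization, where each dialog turn is classified as summary-worthy or not. It is not the paper's implementation: a TF-IDF plus logistic-regression scorer stands in for the BERT-based extractor, and the turns, labels, and confidence threshold are invented for illustration.

```python
# Teacher-student pseudo-labeling sketch (simplified stand-in for a BERT extractor).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Small labeled set: 1 = include the turn in the extractive summary, 0 = skip it.
labeled_turns = ["My order never arrived", "Thanks, have a nice day",
                 "The refund was processed today", "ok"]
labels = [1, 0, 1, 0]

# Unlabeled turns (plentiful in practice, while labels are scarce).
unlabeled_turns = ["I was charged twice for the same item", "lol",
                   "Please reset my account password", "thanks!!"]

vectorizer = TfidfVectorizer()
X_labeled = vectorizer.fit_transform(labeled_turns)
X_unlabeled = vectorizer.transform(unlabeled_turns)

# 1) Train the teacher on the small labeled set.
teacher = LogisticRegression().fit(X_labeled, labels)

# 2) Teacher assigns pseudo-labels; keep only confident predictions
#    (0.7 is an illustrative threshold, not a value from the paper).
probs = teacher.predict_proba(X_unlabeled)[:, 1]
confident = [(turn, int(p > 0.5)) for turn, p in zip(unlabeled_turns, probs)
             if max(p, 1 - p) > 0.7]

# 3) Train the student on labeled + confidently pseudo-labeled turns.
student_texts = labeled_turns + [turn for turn, _ in confident]
student_labels = labels + [y for _, y in confident]
student = LogisticRegression().fit(vectorizer.transform(student_texts), student_labels)

print(student.predict(vectorizer.transform(["When will my package ship?"])))
```

In the paper's setting, the teacher and student would both be BERT-style extractors and the unlabeled pool would come from additional customer-care dialogs; the sketch only shows the label-propagation loop.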
Zhuang, Y., Song, J., Sadagopan, N., & Beniwal, A. (2023). Self-supervised Pre-training and Semi-supervised Learning for Extractive Dialog Summarization. In ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023 (pp. 1069–1076). Association for Computing Machinery, Inc. https://doi.org/10.1145/3543873.3587680