Neural extractive summarization methods often require large amounts of labeled training data, for which the headlines or lead summaries of news articles can sometimes be used. Such directly usable summaries are not always available, however, especially for user-generated content such as questions posted on community question answering services. In this paper, we address an extractive summarization (i.e., headline extraction) task for such questions as a case study and consider how to alleviate the problem by using question-answer pairs in place of the missing question-headline pairs. To this end, we propose a framework for examining how to use such unlabeled paired data from the viewpoint of training methods. Experimental results show that multi-task training performs well with undersampling and distant supervision.
CITATION STYLE
Machida, K., Ishigaki, T., Kobayashi, H., Takamura, H., & Okumura, M. (2020). Semi-supervised extractive question summarization using question-answer pairs. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12036 LNCS, pp. 255–264). Springer. https://doi.org/10.1007/978-3-030-45442-5_32