Selective and Coverage Multi-head Attention for Abstractive Summarization

Abstract

Although the Transformer model has outperformed traditional sequence-to-sequence models on a variety of natural language processing (NLP) tasks, it still suffers from semantic irrelevance and repetition in abstractive text summarization. The main reason is that the long text to be summarized usually consists of multiple sentences and contains much redundant information. To tackle this problem, we propose a selective and coverage multi-head attention framework based on the original Transformer. It contains a Convolutional Neural Network (CNN) selective gate, which combines n-gram features with the whole semantic representation to extract the core information from the long input text. In addition, we use a coverage mechanism in the multi-head attention to keep track of the words that have already been summarized. Evaluations on Chinese and English text summarization datasets both demonstrate that the proposed selective and coverage multi-head attention model outperforms the baseline models by 4.6 and 0.3 ROUGE-2 points, respectively, and the analysis shows that the proposed model generates summaries of higher quality with less repetition.
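The abstract describes two components: a CNN selective gate that filters encoder states using n-gram features plus a whole-text representation, and a coverage term inside the attention that penalizes words already attended to. Below is a minimal PyTorch sketch of both ideas; the kernel width, gating formula, single-head simplification, and all layer names are assumptions for illustration, not the authors' published configuration.

```python
# Sketch only: layer sizes, kernel width, and the exact gating/coverage
# formulas are assumptions, not the configuration from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNSelectiveGate(nn.Module):
    """Gate encoder states with local n-gram features plus a global summary."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size, padding=kernel_size // 2)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, enc: torch.Tensor) -> torch.Tensor:
        # enc: (batch, src_len, d_model)
        ngram = self.conv(enc.transpose(1, 2)).transpose(1, 2)      # n-gram features
        global_repr = enc.mean(dim=1, keepdim=True).expand_as(enc)  # whole-text representation
        g = torch.sigmoid(self.gate(torch.cat([ngram, global_repr], dim=-1)))
        return g * enc                                              # keep core information


class CoverageAttention(nn.Module):
    """Scaled dot-product attention with a coverage term added to the logits
    (the paper applies coverage per head; a single head is shown for brevity)."""

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.cov_proj = nn.Linear(1, 1)

    def forward(self, query, memory, coverage):
        # query: (batch, 1, d_model); memory: (batch, src_len, d_model)
        # coverage: (batch, src_len), running sum of past attention weights
        scores = torch.matmul(self.q_proj(query), self.k_proj(memory).transpose(1, 2))
        scores = scores / memory.size(-1) ** 0.5
        scores = scores + self.cov_proj(coverage.unsqueeze(-1)).transpose(1, 2)
        attn = F.softmax(scores, dim=-1)             # (batch, 1, src_len)
        coverage = coverage + attn.squeeze(1)        # discourages repeated attention next step
        context = torch.matmul(attn, self.v_proj(memory))
        return context, attn, coverage
```

In this reading, the selective gate is applied once to the encoder output before decoding, while the coverage vector is updated at every decoding step so that source words that have already contributed to the summary receive lower attention scores later on.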

Citation (APA)
Zhang, X., & Liu, G. (2020). Selective and Coverage Multi-head Attention for Abstractive Summarization. In Journal of Physics: Conference Series (Vol. 1453). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1453/1/012004
