Although the Transformer model has outperformed traditional sequence-to-sequence models on a variety of natural language processing (NLP) tasks, it still suffers from semantic irrelevance and repetition in abstractive text summarization. The main reason is that the long text to be summarized usually consists of multiple sentences and contains considerable redundant information. To tackle this problem, we propose a selective and coverage multi-head attention framework based on the original Transformer. It contains a Convolutional Neural Network (CNN) selective gate, which combines n-gram features with a whole-sequence semantic representation to extract the core information from the long input text. In addition, we use a coverage mechanism in the multi-head attention to keep track of the words that have already been summarized. Evaluations on both Chinese and English text summarization datasets demonstrate that the proposed selective and coverage multi-head attention model outperforms the baseline models by 4.6 and 0.3 ROUGE-2 points, respectively. Further analysis shows that the proposed model generates summaries of higher quality with less repetition.
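The abstract only sketches the two components at a high level, so the following PyTorch snippet is a minimal illustrative sketch, not the authors' implementation. The kernel width, the sigmoid gating form, the use of a mean-pooled sentence representation, the single attention head, and the learned scalar weight by which coverage enters the attention logits are all assumptions made for illustration.

```python
# Minimal sketch of a CNN selective gate and a coverage-aware attention head.
# All hyperparameters and layer choices below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CNNSelectiveGate(nn.Module):
    """Gate encoder states using local n-gram features (CNN) combined with a
    whole-sequence representation, suppressing redundant source tokens."""

    def __init__(self, d_model: int, kernel_size: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, enc: torch.Tensor) -> torch.Tensor:
        # enc: (batch, src_len, d_model)
        local = self.conv(enc.transpose(1, 2)).transpose(1, 2)      # n-gram features
        global_rep = enc.mean(dim=1, keepdim=True).expand_as(enc)   # whole-sequence view
        g = torch.sigmoid(self.gate(torch.cat([local, global_rep], dim=-1)))
        return enc * g                                               # gated encoder states


class CoverageAttention(nn.Module):
    """Scaled dot-product attention (one head, for brevity) where a running
    coverage vector penalises source tokens that were already attended to."""

    def __init__(self, d_model: int):
        super().__init__()
        self.scale = d_model ** -0.5
        self.w_cov = nn.Parameter(torch.zeros(1))  # learned coverage weight (assumed form)

    def forward(self, q, k, v, coverage):
        # q: (batch, tgt_len, d); k, v: (batch, src_len, d); coverage: (batch, tgt_len, src_len)
        scores = torch.matmul(q, k.transpose(-2, -1)) * self.scale
        scores = scores - self.w_cov * coverage        # discourage repeated attention
        attn = F.softmax(scores, dim=-1)
        # In autoregressive decoding the coverage would be accumulated one
        # target step at a time; here it is updated in a single call.
        new_coverage = coverage + attn
        return torch.matmul(attn, v), new_coverage


if __name__ == "__main__":
    enc = torch.randn(2, 10, 64)                       # toy encoder output
    gated = CNNSelectiveGate(64)(enc)
    dec_q = torch.randn(2, 5, 64)                      # toy decoder queries
    cov = torch.zeros(2, 5, 10)
    ctx, cov = CoverageAttention(64)(dec_q, gated, gated, cov)
    print(ctx.shape, cov.shape)                        # (2, 5, 64) (2, 5, 10)
```

In a full multi-head setup, the same coverage penalty would be applied per head before the softmax, and the gated encoder states would replace the plain encoder output fed to the decoder's cross-attention.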
Zhang, X., & Liu, G. (2020). Selective and Coverage Multi-head Attention for Abstractive Summarization. In Journal of Physics: Conference Series (Vol. 1453). Institute of Physics Publishing. https://doi.org/10.1088/1742-6596/1453/1/012004