Recently, encoder-decoder models with attention have shown meaningful results on abstractive summarization tasks. In the attention mechanism, the attention distribution is generated based only on the current decoder state. However, since there are patterns in how summaries are written, patterns should also exist in how attention is paid. In this work, we propose an attention history-based attention model that exploits such patterns in the attention history. We build an additional recurrent network, the attention reader network, to model the attention patterns. We also employ an accumulation vector that keeps the total amount of effective attention paid to each part of the input text, guided by an additional network named the accumulation network. Both the attention reader network and the accumulation vector serve as additional inputs to the attention mechanism. Evaluation results on the CNN/Daily Mail dataset show that our method better captures the attention patterns and achieves higher ROUGE scores than strong baselines.
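The following is a minimal illustrative sketch of this idea in PyTorch, not the authors' released implementation: all module and parameter names (HistoryAwareAttention, reader_dim, acc_gate, and so on) are assumptions, and the exact wiring of the reader and accumulation networks in the paper may differ. It shows one plausible way to feed an attention-history summary and a gated accumulation vector back into an additive attention mechanism.

```python
# Sketch (assumed, not the paper's code): attention conditioned on attention
# history via (a) a "reader" GRU that summarises past attention steps and
# (b) an accumulation vector of per-position attention mass, updated through
# a gate produced by a small "accumulation network".
import torch
import torch.nn as nn
import torch.nn.functional as F


class HistoryAwareAttention(nn.Module):
    def __init__(self, enc_dim, dec_dim, attn_dim, reader_dim):
        super().__init__()
        self.W_enc = nn.Linear(enc_dim, attn_dim, bias=False)      # encoder states
        self.W_dec = nn.Linear(dec_dim, attn_dim, bias=False)      # current decoder state
        self.W_read = nn.Linear(reader_dim, attn_dim, bias=False)  # attention-history summary
        self.w_acc = nn.Linear(1, attn_dim, bias=False)            # per-position accumulation
        self.v = nn.Linear(attn_dim, 1, bias=False)
        # Attention reader network: reads the attention-weighted context each step.
        self.reader = nn.GRUCell(enc_dim, reader_dim)
        # Accumulation network: gates how much of the new attention counts as "effective".
        self.acc_gate = nn.Linear(dec_dim + enc_dim, 1)

    def forward(self, enc_states, dec_state, reader_h, acc):
        # enc_states: (B, L, enc_dim); dec_state: (B, dec_dim)
        # reader_h:   (B, reader_dim); acc: (B, L)
        scores = self.v(torch.tanh(
            self.W_enc(enc_states)
            + self.W_dec(dec_state).unsqueeze(1)
            + self.W_read(reader_h).unsqueeze(1)
            + self.w_acc(acc.unsqueeze(-1))
        )).squeeze(-1)                                    # (B, L)
        attn = F.softmax(scores, dim=-1)                  # attention distribution
        context = torch.bmm(attn.unsqueeze(1), enc_states).squeeze(1)  # (B, enc_dim)
        # Update the attention-history summary and the accumulation vector.
        reader_h = self.reader(context, reader_h)
        gate = torch.sigmoid(self.acc_gate(torch.cat([dec_state, context], dim=-1)))
        acc = acc + gate * attn
        return context, attn, reader_h, acc
```

Under these assumptions, the accumulation vector plays a role similar to a coverage vector, except that the gate from the accumulation network decides how much of each step's attention is actually accumulated as "effective" attention.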
Lee, H., Choi, Y., & Lee, J. H. (2020). Attention history-based attention for abstractive text summarization. In Proceedings of the ACM Symposium on Applied Computing (pp. 1075–1081). Association for Computing Machinery. https://doi.org/10.1145/3341105.3373892