The attention-based encoder-decoder framework is widely used in natural language generation tasks. The attention mechanism builds alignments between target words and source items, which facilitate text generation. Prior work proposes supervised attention, which uses human knowledge to guide the attention mechanism toward better alignments. However, well-designed supervision built from ideal alignments can be costly or even infeasible to obtain. In this paper, we propose a Generalized Supervised Attention (GSA) method based on quasi alignments, which specify candidate sets of alignments and are much easier to obtain than ideal alignments. We design a Summation Cross-Entropy (SCE) loss and a Supervised Multiple Attention (SMA) structure to accommodate quasi alignments. Experiments on three text generation tasks demonstrate that GSA improves generation performance and is robust to errors in the attention supervision.
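The abstract does not spell out the SCE loss, so the sketch below assumes one plausible reading: instead of pushing attention onto a single gold-aligned source item, the loss maximizes the total attention mass falling inside a quasi alignment's candidate set, leaving the model free to distribute attention within that set. The function name, tensor shapes, and exact formulation are illustrative assumptions, not the paper's definition.

```python
import torch

def summation_cross_entropy(attn_weights, candidate_mask, eps=1e-8):
    """Hypothetical sketch of a Summation Cross-Entropy (SCE) style loss.

    attn_weights:   (batch, tgt_len, src_len) attention distribution over
                    source items for each target position (rows sum to 1).
    candidate_mask: (batch, tgt_len, src_len) binary mask marking the
                    quasi-alignment candidate set for each target word.
    """
    # Total attention mass that falls inside the candidate set.
    mass_in_set = (attn_weights * candidate_mask).sum(dim=-1)
    # Only supervise target positions that actually have a candidate set.
    supervised = candidate_mask.sum(dim=-1) > 0
    # Negative log of the summed mass: maximized when all attention
    # lands somewhere within the candidate set.
    loss = -torch.log(mass_in_set + eps)
    return (loss * supervised).sum() / supervised.sum().clamp(min=1)

# Toy usage: 2 sentences, 5 target words, 7 source items; the first
# 3 source items form each target word's candidate set.
attn = torch.softmax(torch.randn(2, 5, 7), dim=-1)
mask = torch.zeros(2, 5, 7)
mask[:, :, :3] = 1.0
print(summation_cross_entropy(attn, mask))
```

Under this reading, an exact (ideal) alignment is recovered as the special case where each candidate set contains a single source item, in which case the loss reduces to an ordinary cross-entropy on the attention weights.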
Citation
Liu, Y., Zhang, L., Zhang, X., Jiang, Y., Zhang, Y., & Tu, K. (2021). Generalized Supervised Attention for Text Generation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 4991–5003). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-acl.442