Abstract
Neural approaches to natural language generation in task-oriented dialogue have typically required large amounts of annotated training data to achieve satisfactory performance, especially when generating from compositional inputs. To address this issue, we show that self-training enhanced with constrained decoding yields large gains in data efficiency on a conversational weather dataset that employs compositional meaning representations. In particular, our experiments indicate that self-training with constrained decoding can enable sequence-to-sequence models to achieve satisfactory quality using vanilla decoding with five to ten times less data than an ordinary supervised baseline; moreover, by leveraging pretrained models, data efficiency can be increased further to fifty times. We confirm the main automatic results with human evaluations and show that they extend to an enhanced, compositional version of the E2E dataset. The end result is an approach that makes it possible to achieve acceptable performance on compositional NLG tasks using hundreds rather than tens of thousands of training samples.
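To make the high-level procedure concrete, the following is a minimal, illustrative sketch (not the authors' code) of a self-training loop in which constrained decoding is approximated by a simple coverage check: a pseudo-labeled example is kept only if the generated text realizes every slot value of its meaning representation. All function names and the toy data format are hypothetical.

```python
def covers_all_slots(mr, text):
    """Stand-in for constrained decoding: accept a candidate realization
    only if every slot value of the meaning representation (MR) appears
    in the generated text."""
    return all(value.lower() in text.lower() for value in mr.values())

def self_train(labeled, unlabeled_mrs, generate, train, rounds=2):
    """Generic self-training loop.

    labeled:       list of (mr, text) seed examples
    unlabeled_mrs: MRs without reference texts
    generate:      (model, mr) -> generated text
    train:         list of (mr, text) -> model
    """
    data = list(labeled)
    model = train(data)
    for _ in range(rounds):
        # Pseudo-label the unlabeled MRs with the current model.
        pseudo = [(mr, generate(model, mr)) for mr in unlabeled_mrs]
        # Keep only pseudo-labels that pass the constraint check,
        # then retrain on seed data plus accepted pseudo-labels.
        data = list(labeled) + [(mr, t) for mr, t in pseudo
                                if covers_all_slots(mr, t)]
        model = train(data)
    return model
```

In the paper's setting the generator is a sequence-to-sequence model and the constraint is enforced during decoding itself; the post-hoc filter above simply illustrates where the constraint fits in the loop.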
Li, X., Stevens-Guille, S. J., Maskharashvili, A., & White, M. (2021). Self-Training for Compositional Neural NLG in Task-Oriented Dialogue. In INLG 2021 - 14th International Conference on Natural Language Generation, Proceedings (pp. 87–102). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.inlg-1.10