Sequence-to-sequence learning as beam-search optimization

Abstract

Sequence-to-Sequence (seq2seq) modeling has rapidly become an important general-purpose NLP tool that has proven effective for many text-generation and sequence-labeling tasks. Seq2seq builds on deep neural language modeling and inherits its remarkable accuracy in estimating local, next-word distributions. In this work, we introduce a model and beam-search training scheme, based on the work of Daumé III and Marcu (2005), that extends seq2seq to learn global sequence scores. This structured approach avoids classical biases associated with local training and unifies the training loss with the test-time usage, while preserving the proven model architecture of seq2seq and its efficient training approach. We show that our system outperforms a highly optimized attention-based seq2seq system and other baselines on three different sequence-to-sequence tasks: word ordering, parsing, and machine translation.
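
The idea of training against beam search with global sequence scores can be illustrated with a small sketch. This is not the authors' implementation: the function name `beam_margin_loss`, the `step_score` callback, and the reset-to-gold rule are assumptions made for illustration, and the loss is a simplified margin penalty applied whenever the gold prefix falls off the beam. A real system would compute `step_score` with a neural decoder and backpropagate through the loss.

```python
from typing import Callable, List, Sequence, Tuple

Prefix = Tuple[int, ...]


def beam_margin_loss(
    step_score: Callable[[Prefix, int], float],  # hypothetical scorer: (prefix, token) -> unnormalized score
    gold: Sequence[int],
    vocab: Sequence[int],
    beam_size: int,
    margin: float = 1.0,
) -> Tuple[float, int]:
    """Run beam search along one gold sequence; return (loss, violations).

    A sequence's score is the sum of its step scores (a global score,
    not a product of locally normalized probabilities).
    """
    beam: List[Tuple[float, Prefix]] = [(0.0, ())]
    gold_score = 0.0
    loss = 0.0
    violations = 0
    for t, gold_tok in enumerate(gold):
        gold_prefix = tuple(gold[:t])
        gold_next = gold_prefix + (gold_tok,)
        gold_score += step_score(gold_prefix, gold_tok)

        # Expand every hypothesis on the beam by every vocabulary token.
        candidates = [
            (s + step_score(p, w), p + (w,)) for s, p in beam for w in vocab
        ]
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]

        # Simplifying assumption: a violation is counted only when the
        # gold prefix is no longer among the top `beam_size` hypotheses.
        if all(p != gold_next for _, p in beam):
            kth_score = beam[-1][0]  # score of the lowest-ranked survivor
            loss += max(0.0, margin - gold_score + kth_score)
            violations += 1
            # Restart the search from the gold prefix so later steps
            # are still scored against a reachable history.
            beam = [(gold_score, gold_next)]
    return loss, violations
```

Because the same beam search drives both the loss and decoding, a scorer that keeps the gold prefix on the beam at every step incurs zero loss, which is one way to read the abstract's claim that training and test-time usage are unified.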

Cite

Wiseman, S., & Rush, A. M. (2016). Sequence-to-sequence learning as beam-search optimization. In EMNLP 2016 - Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 1296–1306). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d16-1137
