Abstract
Sequence to sequence (SEQ2SEQ) models often lack diversity in their generated translations. This can be attributed to the limited ability of SEQ2SEQ models to capture the lexical and syntactic variations in a parallel corpus, variations that arise from differences in style, genre, topic, or the inherent ambiguity of translation. In this paper, we develop a novel sequence to sequence mixture (S2SMIX) model that improves both translation diversity and quality by adopting a committee of specialized translation models rather than a single translation model. Each mixture component selects its own training data via optimization of the marginal log-likelihood, which leads to a soft clustering of the parallel corpus. Experiments on four language pairs demonstrate the superiority of our mixture model over a SEQ2SEQ baseline with standard or diversity-boosted beam search. Our mixture model requires a negligible number of additional parameters and incurs no extra computation cost during decoding.
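As a sketch of what "optimization of the marginal log-likelihood" means for a K-component mixture, the standard objective takes the following form; the notation here (x for the source sentence, y for the target, z for the latent component index, theta for the model parameters) is ours and not taken from the abstract:

\log p_\theta(y \mid x) \;=\; \log \sum_{z=1}^{K} p_\theta(z \mid x)\, p_\theta(y \mid x, z)

The corresponding posterior over components,

p_\theta(z \mid x, y) \;=\; \frac{p_\theta(z \mid x)\, p_\theta(y \mid x, z)}{\sum_{z'=1}^{K} p_\theta(z' \mid x)\, p_\theta(y \mid x, z')},

assigns each sentence pair a fractional responsibility across components, which is the sense in which optimizing this objective induces a soft clustering of the parallel corpus.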
Citation
He, X., Haffari, G., & Norouzi, M. (2018). Sequence to sequence mixture model for diverse machine translation. In Proceedings of the 22nd Conference on Computational Natural Language Learning (CoNLL 2018) (pp. 583–592). Association for Computational Linguistics. https://doi.org/10.18653/v1/K18-1056