Agreement on target-bidirectional LSTMs for sequence-to-sequence learning

Abstract

Recurrent neural networks, particularly long short-term memory (LSTM) networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, so performance degrades on long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of sequence-level losses. Extensive experiments were performed on two standard sequence-to-sequence transduction tasks: machine transliteration and grapheme-to-phoneme transformation. The results show that the proposed approach achieves consistent and substantial improvements over six state-of-the-art systems. In particular, it outperforms the best reported error rates by a clear margin (up to 9% relative gains) on the grapheme-to-phoneme task. Our toolkit is publicly available at https://github.com/lemaoliu/Agtarbidir.
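
To make the agreement idea concrete, here is a minimal sketch, not taken from the authors' toolkit, of agreement-based scoring: a left-to-right LSTM and a right-to-left LSTM each assign a log-probability to a candidate target sequence, and candidates from a hypothetical n-best list are reranked by the sum of the two directional scores. The `DirectionalLM` class, its hyperparameters, and the candidate list are illustrative assumptions, not the paper's actual architecture or its approximate search procedures.

```python
# A minimal sketch (assumed, not the authors' implementation) of
# agreement-based rescoring with a pair of target-directional LSTMs.
import torch
import torch.nn as nn

class DirectionalLM(nn.Module):
    """Simple LSTM language model over target tokens (one direction)."""
    def __init__(self, vocab_size, emb=32, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def log_prob(self, tokens):
        """Sum of log P(token_t | tokens_<t) for a 1-D tensor of token ids."""
        x = tokens.unsqueeze(0)                  # (1, T)
        h, _ = self.lstm(self.embed(x[:, :-1]))  # hidden states predicting each next token
        logits = self.out(h)                     # (1, T-1, vocab)
        logp = torch.log_softmax(logits, dim=-1)
        tgt = x[:, 1:]                           # gold next tokens
        return logp.gather(-1, tgt.unsqueeze(-1)).sum().item()

def agreement_score(l2r, r2l, tokens):
    """Joint score: L2R model on the sequence plus R2L model on its reverse."""
    return l2r.log_prob(tokens) + r2l.log_prob(torch.flip(tokens, dims=[0]))

if __name__ == "__main__":
    torch.manual_seed(0)
    vocab = 20
    l2r, r2l = DirectionalLM(vocab), DirectionalLM(vocab)
    # Hypothetical n-best candidates (token-id sequences) from a base decoder.
    candidates = [torch.randint(0, vocab, (n,)) for n in (5, 7, 6)]
    best = max(candidates, key=lambda c: agreement_score(l2r, r2l, c))
    print("best candidate:", best.tolist())
```

In the paper itself, agreement is enforced during decoding via two approximate joint-search methods rather than by simple n-best reranking; this sketch only illustrates the joint directional scoring criterion that such a search would optimize.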

Cite

APA: Liu, L., Finch, A., Utiyama, M., & Sumita, E. (2016). Agreement on target-bidirectional LSTMs for sequence-to-sequence learning. In Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI 2016) (pp. 2630–2637). AAAI Press. https://doi.org/10.1609/aaai.v30i1.10327
