Encoder-decoder models typically employ only words that occur frequently in the training corpus, in order to reduce computational costs and exclude noise. However, this vocabulary set may still include words that interfere with learning in encoder-decoder models. This paper proposes a method for selecting words that are more suitable for learning encoders by utilizing not only frequency but also co-occurrence information, which we capture using the HITS algorithm. We apply our proposed method to two tasks: machine translation and grammatical error correction. For Japanese-to-English translation, this method achieves a BLEU score that is 0.56 points higher than that of a baseline. Furthermore, it outperforms the baseline method on English grammatical error correction, with an F0.5-measure that is 1.48 points higher.
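The abstract describes ranking candidate vocabulary words by running the HITS algorithm over co-occurrence information rather than relying on frequency alone. The following is a minimal sketch of that general idea, not the paper's actual method: it assumes a simple sliding-window co-occurrence graph and ranks words by HITS authority score. The function names, window-based graph construction, and use of the authority score as the ranking criterion are illustrative assumptions.

```python
# Sketch: HITS-based vocabulary selection over a toy co-occurrence graph.
# The graph construction and scoring here are assumptions for illustration;
# the paper's exact formulation may differ.
import networkx as nx

def cooccurrence_graph(sentences, window=2):
    """Build a directed graph with an edge for each within-window word pair."""
    g = nx.DiGraph()
    for words in sentences:
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                g.add_edge(w, v)
    return g

def select_vocabulary(sentences, size):
    """Rank words by HITS authority score and keep the top `size` words."""
    g = cooccurrence_graph(sentences)
    hubs, authorities = nx.hits(g, max_iter=1000)
    ranked = sorted(authorities, key=authorities.get, reverse=True)
    return set(ranked[:size])

corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
print(select_vocabulary(corpus, size=3))
```

The intuition is that a word's usefulness depends not just on how often it appears but on how well connected it is to other informative words; HITS captures this mutual reinforcement, so a word co-occurring with many well-connected words scores higher than an equally frequent but isolated one.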
CITATION STYLE
Katsumata, S., Matsumura, Y., Yamagishi, H., & Komachi, M. (2018). Graph-based filtering of out-of-vocabulary words for encoder-decoder models. In ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Student Research Workshop (pp. 112–119). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p18-3016