Instant translation model adaptation by translating unseen words in continuous vector space


Abstract

In statistical machine translation (SMT), differences between the domains of the training and test data result in poor translations. Although there have been many studies on domain adaptation of language models and translation models, most require supervised in-domain language resources such as parallel corpora for training and tuning the models. This need for supervised data has made such methods difficult to apply to practical SMT systems. We thus propose a novel method that adapts translation models without in-domain parallel corpora. Our method infers translation candidates for unseen words by nearest-neighbor search after projecting their vector-based semantic representations into the semantic space of the target language. In experiments on out-of-domain translation from Japanese to English, our method improved the BLEU score by 0.5–1.5 points.
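The core idea in the abstract (projecting source-language word vectors into the target-language embedding space, then finding translations by nearest-neighbor search) can be sketched with a linear projection learned from a small seed dictionary. The sketch below uses synthetic toy embeddings and hypothetical vocabulary; the actual embeddings, projection method, and dictionary used in the paper may differ.

```python
import numpy as np

# Toy setup: all vectors and vocabulary below are illustrative assumptions,
# not data from the paper. Real embeddings would come from monolingual corpora.
rng = np.random.default_rng(0)
dim = 5

src_vocab = ["inu", "neko", "tori", "sakana", "uma", "kuma", "tora"]
tgt_vocab = ["dog", "cat", "bird", "fish", "horse", "bear", "tiger"]

src_emb = {w: rng.normal(size=dim) for w in src_vocab}
# Make target embeddings a noisy linear transform of the source ones, so a
# linear projection is learnable in this toy setting.
true_map = rng.normal(size=(dim, dim))
tgt_emb = {t: true_map @ src_emb[s] + 0.01 * rng.normal(size=dim)
           for s, t in zip(src_vocab, tgt_vocab)}

# 1) Learn a linear projection W from a seed dictionary (least squares):
#    minimize sum_i || x_i W - z_i ||^2 over the seed pairs.
seed = 6  # hold out the last pair ("tora" -> "tiger") as the "unseen" word
X = np.stack([src_emb[s] for s in src_vocab[:seed]])
Z = np.stack([tgt_emb[t] for t in tgt_vocab[:seed]])
W, *_ = np.linalg.lstsq(X, Z, rcond=None)  # solves X @ W ~= Z

def translate(src_word):
    """Project an unseen source word into target space and return the
    target word whose embedding is nearest by cosine similarity."""
    q = src_emb[src_word] @ W
    def cosine(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(tgt_emb, key=lambda t: cosine(q, tgt_emb[t]))

print(translate("tora"))  # the held-out, "unseen" word
```

The projection is trained only on the first six word pairs, so looking up the seventh word exercises exactly the unseen-word case the abstract describes: the source vector is mapped across languages and matched against all target embeddings.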

Citation (APA)

Ishiwatari, S., Yoshinaga, N., Toyoda, M., & Kitsuregawa, M. (2018). Instant translation model adaptation by translating unseen words in continuous vector space. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9624 LNCS, pp. 51–62). Springer Verlag. https://doi.org/10.1007/978-3-319-75487-1_5
