Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation?

Citations of this article: 1
Mendeley readers: 14

Abstract

Pre-training masked language models (MLMs) on artificial data has been shown to benefit several natural language processing tasks, such as natural language understanding and summarization; however, it has been less explored for neural machine translation (NMT). A previous study demonstrated the benefit of transfer learning for NMT in a limited setup that differs from MLM pre-training. In this study, we prepared two kinds of artificial data and compared the translation performance of NMT models pre-trained on them with the MLM objective. In addition to random token sequences, we created artificial data that mimics the token frequency distribution of real-world text. Our results show that MLM pre-training on artificial data improves translation performance in low-resource settings, and that artificial data constructed with token frequency information yields further gains.
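The abstract only sketches the data construction, so the following Python sketch is an illustration under stated assumptions rather than the authors' actual setup: artificial sentences are built over a placeholder vocabulary, either with uniformly random tokens (the "random sequences") or with tokens drawn from a Zipf-like distribution as one plausible way to mimic real-world token frequency information, and are then masked for MLM pre-training. The function names, vocabulary size, Zipf exponent, and masking rate (15%, as in BERT) are all assumptions introduced here for illustration.

```python
import numpy as np


def make_artificial_corpus(vocab_size=8000, num_sentences=1000,
                           min_len=5, max_len=30, mimic_frequency=False,
                           zipf_a=1.2, seed=0):
    """Generate artificial 'sentences' of placeholder tokens.

    mimic_frequency=False: token IDs are sampled uniformly at random.
    mimic_frequency=True : token IDs are sampled from a Zipf-like
    distribution, a stand-in for real-world token frequency information
    (assumption; the paper's exact construction is not given here).
    """
    rng = np.random.default_rng(seed)
    if mimic_frequency:
        # Zipfian weights: p(rank r) proportional to 1 / r**zipf_a
        ranks = np.arange(1, vocab_size + 1)
        probs = 1.0 / ranks ** zipf_a
        probs /= probs.sum()
    else:
        probs = None  # uniform over the vocabulary

    corpus = []
    for _ in range(num_sentences):
        length = int(rng.integers(min_len, max_len + 1))
        token_ids = rng.choice(vocab_size, size=length, p=probs)
        corpus.append(" ".join(f"tok{i}" for i in token_ids))
    return corpus


def mask_for_mlm(sentence, mask_prob=0.15, mask_token="[MASK]", seed=0):
    """Randomly replace a fraction of tokens with [MASK] to form MLM inputs."""
    rng = np.random.default_rng(seed)
    tokens = sentence.split()
    masked = [mask_token if rng.random() < mask_prob else t for t in tokens]
    return " ".join(masked)


if __name__ == "__main__":
    random_corpus = make_artificial_corpus(mimic_frequency=False)
    zipfian_corpus = make_artificial_corpus(mimic_frequency=True)
    print(mask_for_mlm(random_corpus[0]))
    print(mask_for_mlm(zipfian_corpus[0]))
```

A frequency-mimicking corpus of this kind reuses a small set of "frequent" token IDs in most sentences, which is the property the abstract credits with the additional improvement over purely random sequences.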


Citation (APA)
Tamura, H., Hirasawa, T., Kim, H., & Komachi, M. (2023). Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation? In EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023 (pp. 2171–2180). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-eacl.166

Readers' Seniority

PhD / Post grad / Masters / Doc: 4 (67%)
Lecturer / Post doc: 1 (17%)
Researcher: 1 (17%)

Readers' Discipline

Computer Science: 8 (80%)
Medicine and Dentistry: 1 (10%)
Neuroscience: 1 (10%)
