Pre-training masked language models (MLMs) with artificial data has proven beneficial for several natural language processing tasks, such as natural language understanding and summarization; however, it has been less explored for neural machine translation (NMT). A previous study showed the benefit of transfer learning for NMT in a limited setup that differs from MLM pre-training. In this study, we prepared two kinds of artificial data and compared the translation performance of NMT models pre-trained on them with MLM. In addition to purely random sequences, we created artificial data that mimics token frequency information from the real world. Our results show that pre-training models on artificial data via MLM improves translation performance in low-resource settings. Moreover, pre-training on artificial data built with token frequency information in mind further improves performance.
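To make the two kinds of artificial data concrete, the sketch below generates uniformly random token sequences and frequency-aware sequences drawn from a Zipf-like distribution. This is an illustrative assumption, not the authors' released procedure: the vocabulary size, sentence-length range, and Zipf exponent are hypothetical choices, and the paper may derive frequencies from a real corpus rather than a parametric distribution.

```python
import numpy as np

# Minimal sketch of the two kinds of artificial pre-training data described above.
# All constants below are assumed values for illustration only.
VOCAB_SIZE = 8000          # assumed subword vocabulary size
MIN_LEN, MAX_LEN = 5, 40   # assumed sentence-length range


def random_sequence(rng: np.random.Generator) -> list:
    """Uniformly random token IDs: no token frequency structure at all."""
    length = rng.integers(MIN_LEN, MAX_LEN + 1)
    return rng.integers(0, VOCAB_SIZE, size=length).tolist()


def zipfian_sequence(rng: np.random.Generator, a: float = 1.2) -> list:
    """Token IDs drawn from a truncated Zipf-like distribution, so low-rank
    IDs appear far more often, mimicking real-world token frequencies."""
    length = rng.integers(MIN_LEN, MAX_LEN + 1)
    ranks = np.arange(1, VOCAB_SIZE + 1)
    probs = 1.0 / ranks**a          # probability proportional to 1 / rank**a
    probs /= probs.sum()
    return rng.choice(VOCAB_SIZE, size=length, p=probs).tolist()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print("uniform :", random_sequence(rng)[:10])
    print("zipfian :", zipfian_sequence(rng)[:10])
```

Sequences produced this way would then be used as the corpus for MLM pre-training before fine-tuning the model on the low-resource translation data.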
CITATION STYLE
Tamura, H., Hirasawa, T., Kim, H., & Komachi, M. (2023). Does Masked Language Model Pre-training with Artificial Data Improve Low-resource Neural Machine Translation? In EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023 (pp. 2171–2180). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-eacl.166