Improving grammatical error correction with machine translation pairs

Abstract

We propose a novel data synthesis method that generates diverse error-corrected sentence pairs for improving grammatical error correction (GEC). The method is based on a pair of machine translation models (e.g., Chinese-English) of different qualities (i.e., poor and good). The poor translation model resembles an ESL (English as a second language) learner and tends to generate translations of low quality in terms of fluency and grammaticality, while the good translation model generally generates fluent and grammatically correct translations. With the pair of translation models, we can generate an unlimited number of poor-good English sentence pairs from text in the source language (e.g., Chinese) of the translators. Our approach can generate various error-corrected patterns and nicely complements other data synthesis approaches for GEC. Experimental results demonstrate that the data generated by our approach effectively helps a GEC model improve its performance and approach the state-of-the-art single-model performance on the BEA-19 and CoNLL-14 benchmark datasets.
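
The pairing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `synthesize_gec_pairs`, the toy lookup-table "translators", and the choice to drop pairs whose two outputs are identical are all assumptions made for illustration. In practice the two callables would wrap a weak and a strong translation model.

```python
from typing import Callable, Iterable, List, Tuple

# A translator is modeled as any callable from source text to English text.
Translator = Callable[[str], str]


def synthesize_gec_pairs(
    sources: Iterable[str],
    poor_translate: Translator,
    good_translate: Translator,
) -> List[Tuple[str, str]]:
    """Build (erroneous, corrected) English pairs from source-language text.

    Hypothetical helper: the poor translation plays the role of the
    learner sentence, the good translation the role of its correction.
    """
    pairs = []
    for src in sources:
        poor = poor_translate(src).strip()
        good = good_translate(src).strip()
        # Assumption: skip empty outputs and pairs with no difference,
        # since they carry no correction signal in this sketch.
        if poor and good and poor != good:
            pairs.append((poor, good))
    return pairs


# Toy stand-ins for the two translation models (hand-written outputs,
# not produced by any real system).
_POOR = {
    "我昨天去了学校。": "I go to school yesterday.",
    "他喜欢苹果。": "He likes apples.",
}
_GOOD = {
    "我昨天去了学校。": "I went to school yesterday.",
    "他喜欢苹果。": "He likes apples.",
}


def toy_poor(src: str) -> str:
    return _POOR[src]


def toy_good(src: str) -> str:
    return _GOOD[src]


pairs = synthesize_gec_pairs(list(_POOR), toy_poor, toy_good)
for erroneous, corrected in pairs:
    print(f"{erroneous} -> {corrected}")
```

Passing the translators as plain callables keeps the sketch independent of any particular translation toolkit; the resulting pairs would then be used as additional training data for a GEC model.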

Cite

Zhou, W., Ge, T., Mu, C., Xu, K., Wei, F., & Zhou, M. (2020). Improving grammatical error correction with machine translation pairs. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 318–328). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.findings-emnlp.30