Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation

Abstract

Incorporating data augmentation into the grammatical error correction (GEC) task has attracted much attention. However, existing data augmentation methods mainly apply noise to tokens, which limits the diversity of the generated errors. In view of this, we propose a new data augmentation method that applies noise to the latent representation of a sentence. By editing the latent representations of grammatical sentences, we can generate synthetic samples with various error types. Combined with some pre-defined rules, our method can greatly improve the performance and robustness of existing grammatical error correction models. We evaluate our method on public GEC benchmarks, and it achieves state-of-the-art performance on the CoNLL-2014 and FCE benchmarks.
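To make the contrast in the abstract concrete, the toy sketch below sets token-level noise beside noise applied to a sentence vector. It is only an illustration under stated assumptions, not the authors' method: the function names `token_noise` and `latent_noise`, the Gaussian perturbation, and the four-dimensional vector `z` are all hypothetical, and the encoder and decoder that would map a sentence to and from its latent representation are assumed and not shown.

```python
import random

def token_noise(tokens, rng, p=0.1):
    """Token-level noise: drop each token with probability p (a simple baseline)."""
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else list(tokens)  # never return an empty sentence

def latent_noise(z, rng, sigma=0.1):
    """Latent-level noise: add Gaussian noise to each dimension of a sentence vector.

    Assumption: a plain Gaussian perturbation stands in for whatever
    editing procedure a real system would use.
    """
    return [x + rng.gauss(0.0, sigma) for x in z]

rng = random.Random(0)

# Token-level: the corruption acts directly on the surface tokens.
tokens = "she goes to school every day".split()
noisy_tokens = token_noise(tokens, rng, p=0.3)

# Latent-level: the corruption acts on a vector; a decoder (not shown)
# would then turn the perturbed vector back into a sentence.
z = [0.5, -1.2, 0.3, 0.8]  # hypothetical latent vector from some encoder
z_noisy = latent_noise(z, rng, sigma=0.1)
```

In this sketch the token-level function can only produce the error kinds it was written for (here, deletions), whereas a perturbed latent vector leaves the form of the error to the decoder, which is the intuition behind the diversity argument in the abstract.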

Cite

Wan, Z., Wan, X., & Wang, W. (2020). Improving Grammatical Error Correction with Data Augmentation by Editing Latent Representation. In COLING 2020 - 28th International Conference on Computational Linguistics, Proceedings of the Conference (pp. 2202–2212). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.coling-main.200
