Erroneous data generation for grammatical error correction

Abstract

It has been demonstrated that the utilization of a monolingual corpus in neural Grammatical Error Correction (GEC) systems can significantly improve the system performance. The previous state-of-the-art neural GEC system is an ensemble of four Transformer models pretrained on a large amount of Wikipedia Edits. The Singsound GEC system follows a similar approach but is equipped with a sophisticated erroneous data generating component. Our system achieved an F0.5 of 66.61 in the BEA 2019 Shared Task: Grammatical Error Correction. With our novel erroneous data generating component, the Singsound neural GEC system yielded an M2 score of 63.2 on the CoNLL-2014 benchmark (8.4% relative improvement over the previous state-of-the-art system).
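
The two figures quoted in the abstract rest on simple arithmetic, sketched below for orientation. This is an illustrative sketch, not the shared task's scorer: the function names `f_beta` and `relative_improvement` and the edit counts in the usage lines are hypothetical, and the official tools (ERRANT for BEA 2019, the M2 scorer for CoNLL-2014) also handle edit extraction and alignment, which is omitted here.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from edit-level counts (hypothetical helper).

    tp: proposed edits that match a reference edit
    fp: proposed edits that match no reference edit
    fn: reference edits the system missed
    beta < 1 favours precision over recall; GEC evaluation uses beta = 0.5.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def relative_improvement(new: float, old: float) -> float:
    """Relative gain of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Made-up counts, for illustration only: P = 0.75, R = 0.60
score = f_beta(tp=60, fp=20, fn=40)
gain = relative_improvement(110.0, 100.0)
```

Read this way, an 8.4% relative improvement ending at 63.2 would imply a previous score of roughly 63.2 / 1.084, or about 58.3; that value is derived from the abstract's own numbers and is not stated in it.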

Cite

Xu, S., Zhang, J., Chen, J., & Qin, L. (2019). Erroneous data generation for grammatical error correction. In ACL 2019 - Innovative Use of NLP for Building Educational Applications, BEA 2019 - Proceedings of the 14th Workshop (pp. 149–158). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-4415
