New Dataset and Strong Baselines for the Grammatical Error Correction of Russian

Abstract

Motivated by recent advancements in grammatical error correction in English and existing issues in the field, we describe a new resource, an annotated learner corpus of Russian, extracted from the Lang-8 language learning website. This new dataset is benchmarked against two grammatical error correction models that use state-of-the-art neural architectures. Results are provided on the newly created corpus and are compared against performance on another, existing resource. We also evaluate the contribution of the Lang-8 training data to the grammatical error correction of Russian and perform a type-based analysis of the models. The expert annotations are available for research purposes.

Cite

Trinh, V. A., & Rozovskaya, A. (2021). New Dataset and Strong Baselines for the Grammatical Error Correction of Russian. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 4103–4111). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.359