We report on work in progress on generating a synthetic error dataset for Swedish by replicating errors observed in an authentic error-annotated dataset. We analyze a small subset of authentic errors, capture regular patterns based on parts of speech, and design a set of rules to corrupt new data. We explore the approach and identify its capabilities, advantages, and limitations as a way to enrich the existing collection of error-annotated data. This work focuses on word order errors, specifically those involving the placement of finite verbs in a sentence.
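As a rough illustration of what a part-of-speech-based corruption rule could look like, the sketch below moves a finite verb out of second position after a fronted adverb, turning a correct verb-second sentence into a common learner-style word order. It is not the authors' rule set: the function name `corrupt_v2`, the tag labels (`ADV`, `VERB_FIN`, `PRON`), and the hand-tagged example sentence are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# (word, coarse POS tag). Tags are hand-assigned and hypothetical;
# a real pipeline would take them from a tagger.
Token = Tuple[str, str]


def corrupt_v2(tokens: List[Token]) -> Optional[List[Token]]:
    """Swap the finite verb and the subject pronoun after a fronted adverb.

    Returns a corrupted copy, or None if the pattern does not occur.
    """
    for i in range(len(tokens) - 2):
        tags = [tag for _, tag in tokens[i:i + 3]]
        if tags == ["ADV", "VERB_FIN", "PRON"]:
            corrupted = list(tokens)
            # Move the verb from second to third position.
            corrupted[i + 1], corrupted[i + 2] = corrupted[i + 2], corrupted[i + 1]
            return corrupted
    return None


# Correct Swedish: "Idag äter jag fisk" ("Today I eat fish").
sentence = [("Idag", "ADV"), ("äter", "VERB_FIN"), ("jag", "PRON"), ("fisk", "NOUN")]
result = corrupt_v2(sentence)
corrupted_text = " ".join(word for word, _ in result) if result else None
print(corrupted_text)
```

Returning `None` when the pattern is absent keeps uncorrupted sentences out of the synthetic set, so each generated example can be paired with its original as a correction target.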
Moner, J. C., & Volodina, E. (2022). Generation of Synthetic Error Data of Verb Order Errors for Swedish. In BEA 2022 - 17th Workshop on Innovative Use of NLP for Building Educational Applications, Proceedings (pp. 33–38). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.bea-1.6