Factored statistical machine translation for grammatical error correction

Abstract

This paper describes our ongoing work on grammatical error correction (GEC). To handle all error types in a real-life setting, we propose a factored statistical machine translation (SMT) model for the task. We treat error correction as a series of translation problems guided by linguistic information, encoded as factors that influence the translation results. The factors in our study are morphological information (word stem, prefix, and suffix) and part-of-speech (PoS) information. We also experimented with different combinations of translation models (TMs), both phrase-based and factor-based, trained on various datasets to boost overall performance. Empirical results show that the proposed model yields an improvement of 32.54% over a baseline phrase-based SMT model. The system participated in the CoNLL 2014 shared task and achieved the 7th and 5th F0.5 scores on the official test set among the thirteen participating teams.
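To make the idea of factors concrete, the sketch below annotates each token with the four factors named in the abstract (stem, prefix, suffix, PoS) and joins them with the pipe separator that factored SMT toolkits such as Moses use for input. It is only an illustration under stated assumptions, not the authors' pipeline: the helper names (`naive_stem`, `naive_affixes`, `to_factored`), the fixed three-character affixes, the toy suffix-stripping stemmer, and the factor order are all hypothetical, and the PoS tags are supplied by the caller rather than produced by a tagger.

```python
from typing import List, Tuple


def naive_affixes(word: str, n: int = 3) -> Tuple[str, str]:
    """Return the first and last n characters as stand-in prefix/suffix factors.

    Assumption: fixed-length character affixes; the paper may define them differently.
    """
    lower = word.lower()
    return lower[:n], lower[-n:]


def naive_stem(word: str) -> str:
    """Strip a few common English endings; a placeholder for a real stemmer."""
    lower = word.lower()
    for ending in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of stem remain.
        if lower.endswith(ending) and len(lower) > len(ending) + 2:
            return lower[: -len(ending)]
    return lower


def to_factored(tokens: List[str], pos_tags: List[str]) -> str:
    """Render a sentence as 'word|stem|prefix|suffix|pos' tokens separated by spaces.

    Tokens that themselves contain '|' are not escaped in this sketch.
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must have the same length")
    factored = []
    for word, pos in zip(tokens, pos_tags):
        prefix, suffix = naive_affixes(word)
        factored.append("|".join([word, naive_stem(word), prefix, suffix, pos]))
    return " ".join(factored)


if __name__ == "__main__":
    # PoS tags are given by hand here; a real system would run a tagger.
    print(to_factored(["She", "walks", "home"], ["PRP", "VBZ", "NN"]))
```

In a factored model, each of these columns can be translated or used as context separately, which is what lets morphological and PoS information guide the choice of correction.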

Cite

Wang, Y., Wang, L., Wong, D. F., Chao, L. S., Zeng, X., & Lu, Y. (2014). Factored statistical machine translation for grammatical error correction. In CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings of the Shared Task (pp. 83–90). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-1711
