This paper describes our ongoing work on grammatical error correction (GEC). To cover all possible error types in a real-life environment, we propose a factored statistical machine translation (SMT) model for the task. We treat error correction as a series of translation problems guided by linguistic information, which enters the model as factors that influence the translation output. The factors in our study are morphological information (word stem, prefix, and suffix) and part-of-speech (PoS) information. We also experimented with different combinations of translation models (TMs), both phrase-based and factor-based, trained on various datasets to boost overall performance. Empirical results show that the proposed model yields an improvement of 32.54% over a baseline phrase-based SMT model. The system participated in the CoNLL 2014 shared task and achieved the 7th and 5th F0.5 scores on the official test set among the thirteen participating teams.
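To make the idea of factors concrete, the following is a minimal sketch of how a token might be annotated with stem, prefix, suffix, and PoS factors before training a factored model. It assumes the pipe-separated factor notation used by factored SMT toolkits such as Moses. The affix lists, the `analyse` and `factor_sentence` helpers, and the `-` placeholder for an empty factor are hypothetical illustrations, not the authors' preprocessing; a real system would use a proper morphological analyser and PoS tagger.

```python
from typing import List, Tuple

# Hypothetical, deliberately tiny affix lists; a real system would use a
# morphological analyser rather than string matching.
PREFIXES = ("un", "re", "dis")
SUFFIXES = ("ing", "ed", "es", "s")
EMPTY = "-"  # assumed placeholder so every token carries the same number of factors


def analyse(word: str) -> Tuple[str, str, str]:
    """Return (stem, prefix, suffix) using naive affix stripping."""
    lower = word.lower()
    # Strip a prefix only if at least three characters remain.
    prefix = next(
        (p for p in PREFIXES if lower.startswith(p) and len(lower) - len(p) >= 3),
        "",
    )
    rest = lower[len(prefix):]
    # Strip a suffix only if at least three characters remain.
    suffix = next(
        (s for s in SUFFIXES if rest.endswith(s) and len(rest) - len(s) >= 3),
        "",
    )
    stem = rest[: len(rest) - len(suffix)] if suffix else rest
    return stem, prefix or EMPTY, suffix or EMPTY


def factor_sentence(tagged: List[Tuple[str, str]]) -> str:
    """Turn (word, PoS) pairs into 'word|stem|prefix|suffix|pos' tokens."""
    tokens = []
    for word, pos in tagged:
        stem, prefix, suffix = analyse(word)
        tokens.append("|".join((word, stem, prefix, suffix, pos)))
    return " ".join(tokens)


if __name__ == "__main__":
    print(factor_sentence([("He", "PRP"), ("replays", "VBZ"), ("it", "PRP")]))
```

In this representation each surface word carries its factors alongside it, so a factored translation model can condition on, for example, the stem and PoS tag rather than on the surface form alone. Words that themselves contain the `|` character would need escaping, which this sketch does not handle.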
CITATION STYLE
Wang, Y., Wang, L., Wong, D. F., Chao, L. S., Zeng, X., & Lu, Y. (2014). Factored statistical machine translation for grammatical error correction. In CoNLL 2014 - 18th Conference on Computational Natural Language Learning, Proceedings of the Shared Task (pp. 83–90). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/w14-1711