Particle Error Correction from Small Error Data for Japanese Learners

Kenji Imamura; Kuniko Saito; Kugatsu Sadamitsu; Hitoshi Nishikawa

Journal Article (Open Access)

Journal of Natural Language Processing (2014) 21(4) 941-963

DOI: 10.5715/jnlp.21.941

Abstract

This paper shows how to correct particle errors made by learners of Japanese. Our method is based on discriminative sequence conversion, which converts one word sequence into another and corrects particle errors by substitution, insertion, or deletion. Such a method needs error data for training, but large learner corpora are difficult to collect. We address this problem within a discriminative learning framework using two methods. First, language model probabilities obtained from large raw text corpora are combined with n-gram binary features obtained from learner corpora. This combination is used to measure the correctness of Japanese sentences. Second, automatically generated pseudo-error sentences are added to the learner corpora to enrich them directly. Furthermore, we apply domain adaptation, in which the pseudo-error sentences (the source domain) are adapted to the real error sentences (the target domain). Experiments show that recall improves when language model probabilities and n-gram binary features are used together, and that pseudo-error sentences combined with domain adaptation yield stable improvement.
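
To make the idea of pseudo-error sentences concrete, here is a minimal Python sketch of one way such sentences could be generated from correct, already tokenized text. It is an illustration only and not the authors' procedure: the particle list `PARTICLES`, the function `make_pseudo_error`, and the rates `error_rate` and `spurious_rate` are all assumptions introduced here, since the abstract does not say how errors are generated. Each recorded edit names the operation a corrector would have to apply (substitution, insertion, or deletion) to recover the original sentence.

```python
import random

# Hypothetical particle inventory; the abstract does not list the particles handled.
PARTICLES = ["は", "が", "を", "に", "で", "と", "へ", "も"]


def make_pseudo_error(tokens, rng, error_rate=0.3, spurious_rate=0.05):
    """Corrupt the particles of a correct sentence (assumed procedure).

    Returns (noisy_tokens, edits). Each edit is a tuple
    (correction_type, index_in_noisy, wrong_token, correct_token),
    where correction_type is the operation needed to undo the error.
    """
    noisy = []
    edits = []
    for tok in tokens:
        if tok in PARTICLES:
            if rng.random() < error_rate:
                if rng.random() < 0.5:
                    # Replace the particle with a different one;
                    # the corrector must substitute it back.
                    wrong = rng.choice([p for p in PARTICLES if p != tok])
                    edits.append(("substitution", len(noisy), wrong, tok))
                    noisy.append(wrong)
                else:
                    # Drop the particle; the corrector must insert it.
                    edits.append(("insertion", len(noisy), None, tok))
            else:
                noisy.append(tok)
        else:
            noisy.append(tok)
            if rng.random() < spurious_rate:
                # Add an unneeded particle; the corrector must delete it.
                extra = rng.choice(PARTICLES)
                edits.append(("deletion", len(noisy), extra, None))
                noisy.append(extra)
    return noisy, edits


if __name__ == "__main__":
    sentence = ["私", "は", "学校", "に", "行く"]
    noisy, edits = make_pseudo_error(sentence, random.Random(0))
    print(noisy)
    print(edits)
```

In a setup like the one the abstract describes, pairs of corrupted and original sentences produced this way would act as the source domain, and the smaller set of real learner errors as the target domain.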

Cite

Imamura, K., Saito, K., Sadamitsu, K., & Nishikawa, H. (2014). Particle Error Correction from Small Error Data for Japanese Learners. Journal of Natural Language Processing, 21(4), 941–963. https://doi.org/10.5715/jnlp.21.941
