Chinese spelling correction (CSC) is an important yet challenging task. Existing state-of-the-art methods either use only a pre-trained language model or incorporate phonological information as external knowledge. In this paper, we propose a novel end-to-end CSC model that integrates phonetic features into a language model through pre-training and fine-tuning. Instead of masking words with a special token, as is conventional when training a language model, we replace words with their phonetic features and with sound-alike words. We further propose an adaptive weighted objective to jointly train error detection and correction in a unified framework. Experimental results show that our model achieves significant improvements on the SIGHAN datasets and outperforms previous state-of-the-art methods.
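The replacement strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tables `PINYIN` and `SOUND_ALIKE`, the function name `phonetic_corrupt`, and the parameters `rate` and `pinyin_share` are all hypothetical, and the abstract does not state any replacement ratios. The sketch only shows the general idea of substituting a character with its phonetic form or a sound-alike character, where conventional masked language modeling would insert a special mask token.

```python
import random

# Hypothetical toy tables for illustration only; a real system would use a
# full pinyin lexicon and a phonetic confusion set.
PINYIN = {"天": "tian", "气": "qi", "很": "hen", "好": "hao"}
SOUND_ALIKE = {"天": ["田", "添"], "气": ["器", "汽"], "好": ["号"]}


def phonetic_corrupt(chars, rate=0.15, pinyin_share=0.5, rng=None):
    """Replace some characters with pinyin or a sound-alike character.

    Returns (corrupted, labels). labels[i] holds the original character at
    replaced positions and None elsewhere, so a model could be trained to
    recover the original character from the phonetic cue.
    `rate` and `pinyin_share` are assumed values, not taken from the paper.
    """
    rng = rng or random.Random()
    corrupted, labels = [], []
    for ch in chars:
        if ch in PINYIN and rng.random() < rate:
            # Fall back to pinyin when no sound-alike candidate is listed.
            if rng.random() < pinyin_share or ch not in SOUND_ALIKE:
                corrupted.append(PINYIN[ch])
            else:
                corrupted.append(rng.choice(SOUND_ALIKE[ch]))
            labels.append(ch)
        else:
            corrupted.append(ch)
            labels.append(None)
    return corrupted, labels


if __name__ == "__main__":
    sentence = list("天气很好")
    print(phonetic_corrupt(sentence, rate=0.5, rng=random.Random(0)))
```

Keeping the output the same length as the input preserves position-wise alignment between the corrupted sequence and its labels, which is convenient for per-character detection and correction.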
Zhang, R., Pang, C., Zhang, C., Wang, S., He, Z., Sun, Y., … Wang, H. (2021). Correcting Chinese Spelling Errors with Phonetic Pre-training. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 2250–2261). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.198