Correcting Chinese Spelling Errors with Phonetic Pre-training

Abstract

Chinese spelling correction (CSC) is an important yet challenging task. Existing state-of-the-art methods either use only a pre-trained language model or incorporate phonological information as external knowledge. In this paper, we propose a novel end-to-end CSC model that integrates phonetic features into a language model by leveraging the pre-training and fine-tuning paradigm. Instead of the conventional approach of masking words with a special token when training the language model, we replace words with their phonetic features and with sound-alike words. We further propose an adaptive weighted objective to jointly train error detection and correction in a unified framework. Experimental results show that our model achieves significant improvements on the SIGHAN datasets and outperforms previous state-of-the-art methods.
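The replacement strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `PINYIN` table, the `sound_alikes` and `phonetic_mask` helpers, the 15% selection rate, and the 50/50 split between pinyin and sound-alike replacement are all assumptions made for illustration. The abstract does not state these details.

```python
import random

# Hypothetical toy lookup table (an assumption for this sketch); a real
# system would use a full pinyin dictionary and a phonetic confusion set.
PINYIN = {"他": "ta", "她": "ta", "它": "ta", "在": "zai", "再": "zai", "家": "jia"}


def sound_alikes(ch):
    """Return other characters in PINYIN that share ch's pronunciation."""
    py = PINYIN.get(ch)
    if py is None:
        return []
    return [c for c, p in PINYIN.items() if p == py and c != ch]


def phonetic_mask(chars, mask_prob=0.15, rng=None):
    """Corrupt a character sequence instead of inserting a [MASK] token.

    Each selected character is replaced either by its pinyin string or by
    a sound-alike character (50/50 here, an assumed ratio). Returns the
    corrupted sequence and per-position labels (1 = position was replaced),
    which could serve as targets for an error-detection objective.
    """
    if rng is None:
        rng = random.Random()
    corrupted, labels = [], []
    for ch in chars:
        if ch in PINYIN and rng.random() < mask_prob:
            candidates = sound_alikes(ch)
            if candidates and rng.random() < 0.5:
                corrupted.append(rng.choice(candidates))  # sound-alike word
            else:
                corrupted.append(PINYIN[ch])  # phonetic feature (pinyin)
            labels.append(1)
        else:
            corrupted.append(ch)
            labels.append(0)
    return corrupted, labels
```

Calling `phonetic_mask(list("他在家"), mask_prob=1.0)` replaces every character, and the original characters remain available as the correction targets.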

Cite

Zhang, R., Pang, C., Zhang, C., Wang, S., He, Z., Sun, Y., … Wang, H. (2021). Correcting Chinese Spelling Errors with Phonetic Pre-training. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 2250–2261). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.198