Improving Grammatical Error Correction with Multimodal Feature Integration

Abstract

Grammatical error correction (GEC) is a promising task aimed at correcting errors in text. Many methods have been proposed for this task, with remarkable results. However, most of them focus only on enhancing textual feature extraction and do not explore information from other modalities (e.g., speech), which can also provide valuable knowledge to help the model detect grammatical errors. To address this gap, we propose a novel framework that integrates both speech and text features to enhance GEC. In detail, we create new multimodal GEC datasets for English and German by generating audio from text using advanced text-to-speech models. Subsequently, we extract acoustic and textual representations with a multimodal encoder that consists of a speech encoder and a text encoder. A mixture-of-experts (MoE) layer is employed to selectively align representations from the two modalities, and a dot attention mechanism is then used to fuse them into the final multimodal representations. Experimental results on CoNLL14, BEA19 English, and Falko-MERLIN German show that our multimodal GEC models significantly improve over strong baselines and set a new state-of-the-art result on the Falko-MERLIN test set.
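The fusion pipeline described in the abstract (MoE alignment followed by dot attention) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the expert architecture, the gating scheme, or how the attention output is combined with the text features. The sketch below assumes soft gating over linear experts applied to speech frames, text tokens acting as attention queries over the aligned speech frames, and a simple additive combination. All names (`moe_layer`, `dot_attention_fuse`, `random_matrix`) are hypothetical, and random vectors stand in for the encoder outputs.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_layer(x, experts, gate):
    """Assumed soft MoE: a gate mixes the outputs of linear experts."""
    weights = softmax(matvec(gate, x))         # one weight per expert
    outputs = [matvec(W, x) for W in experts]  # one output vector per expert
    dim = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs)) for i in range(dim)]

def dot_attention_fuse(text, speech):
    """Assumed fusion: each text token attends over speech frames with
    plain dot-product scores; the context vector is added to the token."""
    dim = len(text[0])
    fused = []
    for query in text:
        scores = [sum(a * b for a, b in zip(query, key)) for key in speech]
        attn = softmax(scores)
        context = [sum(a * v[i] for a, v in zip(attn, speech)) for i in range(dim)]
        fused.append([q + c for q, c in zip(query, context)])
    return fused

def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or encoder outputs."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d, n_experts = 4, 3
text = random_matrix(5, d, rng)    # 5 token vectors (stand-in for the text encoder)
speech = random_matrix(7, d, rng)  # 7 frame vectors (stand-in for the speech encoder)
experts = [random_matrix(d, d, rng) for _ in range(n_experts)]
gate = random_matrix(n_experts, d, rng)

aligned = [moe_layer(frame, experts, gate) for frame in speech]
fused = dot_attention_fuse(text, aligned)
print(len(fused), len(fused[0]))   # one fused vector per text token
```

The fused output keeps one vector per text token, so under these assumptions it could feed a standard text decoder without changing the sequence length.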

Cite

Fang, T., Hu, J., Wong, D. F., Wan, X., Chao, L. S., & Chang, T. H. (2023). Improving Grammatical Error Correction with Multimodal Feature Integration. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 9328–9344). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.594
