Autoregressive text generation for low-resource languages, particularly with pre-trained language models, is a relatively under-explored problem. In this paper, we model Math Word Problem (MWP) generation as an autoregressive text generation problem. We evaluate two pre-trained sequence-to-sequence language models, mBART and mT5, in the context of two low-resource languages, Sinhala and Tamil, as well as English. For the evaluation, we create a multi-way parallel MWP dataset for the considered languages. Our empirical evaluation analyses how the performance of the pre-trained models is affected by the (1) amount of language data used during pre-training, (2) amount of data used in fine-tuning, (3) input seed length and (4) context differences in MWPs. Our results reveal that the considered pre-trained models are capable of generating meaningful MWPs even for languages under-represented in these models, and even when the amount of fine-tuning data and the seed length are small. Our human evaluation shows that a Mathematics tutor can edit a generated question fairly easily, thus highlighting the practical utility of automatically generating MWPs.
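To make the setup concrete, the sketch below illustrates seed-conditioned MWP generation with an off-the-shelf mT5 checkpoint via the Hugging Face transformers library. This is a minimal illustration under stated assumptions, not the paper's implementation: the checkpoint size, seed texts, target problem, learning rate, and decoding parameters are all illustrative, and the paper fine-tunes on its multi-way parallel MWP dataset rather than a single example.

```python
# Minimal sketch of seed-conditioned MWP generation with mT5.
# Assumptions (not from the paper): the "google/mt5-small" checkpoint,
# the example seed/target texts, and the decoding parameters below.
import torch
from transformers import AutoTokenizer, MT5ForConditionalGeneration

model_name = "google/mt5-small"  # assumed checkpoint; the abstract does not state the model size
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

# One hypothetical fine-tuning step: a short seed as input, a full MWP as target.
batch = tokenizer(
    ["Nimal has 5 mangoes"],  # hypothetical seed text
    text_target=["Nimal has 5 mangoes. He gives 2 to Kamala. "
                 "How many mangoes does Nimal have left?"],
    return_tensors="pt",
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss = model(**batch).loss  # cross-entropy over the target MWP tokens
loss.backward()
optimizer.step()

# After fine-tuning on the full dataset, generate a problem from a new seed.
model.eval()
inputs = tokenizer("Kamala buys 3 pencils", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=64, num_beams=4, early_stopping=True)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The same interface applies to mBART by swapping in `MBartForConditionalGeneration` and a matching checkpoint; the seed-to-problem framing stays the same.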
Niyarepola, K., Athapaththu, D., Ekanayake, S., & Ranathunga, S. (2022). Math Word Problem Generation with Multilingual Language Models. In 15th International Natural Language Generation Conference, INLG 2022 (pp. 144–155). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.inlg-main.12