JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving


Abstract

Although pre-trained language models (PLMs) have recently advanced research in mathematical reasoning, they are not specifically designed as capable multi-task solvers. As a result, they suffer from high multi-task deployment costs (e.g., one model copy per task) and inferior performance on complex mathematical problems in practical applications. To address these issues, we propose JiuZhang 2.0, a unified Chinese PLM designed specifically for multi-task mathematical problem solving. Our idea is to maintain a moderate-sized model and employ cross-task knowledge sharing to improve model capacity in a multi-task setting. Specifically, we construct a Mixture-of-Experts (MoE) architecture for modeling mathematical text, to capture the common mathematical knowledge across tasks. To optimize the MoE architecture, we design multi-task continual pre-training and multi-task fine-tuning strategies for multi-task adaptation. These training strategies can effectively decompose the knowledge from the task data and establish cross-task sharing via expert networks. To further improve the general capacity for solving different complex tasks, we leverage large language models (LLMs) as complementary models that iteratively refine the solutions generated by our PLM via in-context learning. Extensive experiments have demonstrated the effectiveness of our model.
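
To make the idea of sharing knowledge across tasks through expert networks more concrete, the following is a minimal sketch of a generic, densely gated Mixture-of-Experts layer. It is an illustration only and is not taken from the paper: the class name `MoELayer`, the linear experts, the softmax router, and the random initialization are all assumptions, and the actual JiuZhang 2.0 architecture may differ substantially.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class MoELayer:
    """Hypothetical dense MoE layer: every expert runs, a router mixes them."""

    def __init__(self, dim, num_experts, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        # Each expert is a single dim x dim linear map (an assumption).
        self.experts = [init(dim, dim) for _ in range(num_experts)]
        # The router scores each expert from the input vector.
        self.router = init(num_experts, dim)

    def gate(self, x):
        """Return one mixing weight per expert; the weights sum to 1."""
        return softmax(linear(self.router, x))

    def forward(self, x):
        """Gate-weighted sum of all expert outputs for input vector x."""
        gates = self.gate(x)
        outputs = [linear(expert, x) for expert in self.experts]
        return [sum(g * out[i] for g, out in zip(gates, outputs))
                for i in range(len(x))]


layer = MoELayer(dim=4, num_experts=3)
token = [1.0, 0.0, 0.5, -1.0]
gates = layer.gate(token)
mixed = layer.forward(token)
```

In this sketch, inputs from different tasks pass through the same pool of experts, and only the router weights decide how much each expert contributes, which is one simple way cross-task sharing can arise.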

Cite

Zhao, X., Zhou, K., Zhang, B., Gong, Z., Chen, Z., Zhou, Y., … Hu, G. (2023). JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 5660–5672). Association for Computing Machinery. https://doi.org/10.1145/3580305.3599850
