Solving math word problems (MWPs) is challenging because the training data provide only one "standard" solution per problem. As a result, MWP models often fit the given solution rather than truly understanding or solving the problem, which limits their generalization to diverse word scenarios. To address this problem, we propose a novel approach, TSN-MD, that leverages a teacher network to integrate the knowledge of equivalent solution expressions so as to better regularize the learning behavior of the student network. In addition, we introduce a multiple-decoder student network that generates multiple candidate solution expressions, from which the final answer is selected by voting. In experiments, we conduct extensive comparisons and ablation studies on two large-scale MWP benchmarks and show that TSN-MD surpasses state-of-the-art methods by large margins. Intriguingly, the visualization results demonstrate that TSN-MD not only produces correct answers but also generates diverse equivalent expressions for the solution.
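The voting step over candidate expressions can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the voting rule, so a simple majority vote over evaluated answers is assumed here, and the names `evaluate` and `vote_answer` are hypothetical. Each candidate string stands in for the output of one decoder.

```python
import ast
import operator
from collections import Counter

# Arithmetic operators allowed in a candidate expression (assumed set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def evaluate(expr):
    """Safely evaluate an arithmetic expression string to a float."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (isinstance(node, ast.Constant)
                and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return float(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError("unsupported expression: " + expr)

    return walk(ast.parse(expr, mode="eval"))


def vote_answer(candidates, ndigits=6):
    """Return the most common numeric answer among candidate expressions.

    Assumption: majority voting. Candidates that fail to parse or
    evaluate are skipped. Returns None if no candidate is valid.
    """
    values = []
    for expr in candidates:
        try:
            values.append(round(evaluate(expr), ndigits))
        except (ValueError, ZeroDivisionError, SyntaxError):
            continue
    if not values:
        return None
    return Counter(values).most_common(1)[0][0]


# Hypothetical outputs of three decoders for the same problem.
candidates = ["3 * (4 + 2)", "3 * 4 + 3 * 2", "3 * 4 + 2"]
answer = vote_answer(candidates)
```

In this example the first two candidates are equivalent expressions that evaluate to the same value, so they outvote the third, which evaluates differently.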
CITATION STYLE
Zhang, J., Lee, R. K. W., Lim, E. P., Qin, W., Wang, L., Shao, J., & Sun, Q. (2020). Teacher-student networks with multiple decoders for solving math word problem. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 4011–4017). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/555