Multi-hop question generation requires complex reasoning and coherent language realization. Learning a generation model for this problem requires extensive multi-hop question answering (QA) data, which are limited due to the manual collection effort. A two-phase strategy addresses the insufficiency of multi-hop QA data by first generating and then composing single-hop sub-questions. Learning this generate-then-compose two-phase model, however, requires manually labeled question-decomposition data, which are labor-intensive to collect. To overcome this limitation, we propose a novel generative approach that optimizes the two-phase model without question-decomposition data. We treat the unobserved sub-questions as latent variables and propose an objective that estimates the true sub-questions via variational inference. We further generalize the generative modeling to single-hop QA data. We hypothesize that each single-hop question is a sub-question of an unobserved multi-hop question, and propose an objective that generates single-hop questions by decomposing latent multi-hop questions. We show that the two objectives can be unified and that both optimize the two-phase generation model. Experiments show that the proposed approach outperforms competitive baselines on HotpotQA, a benchmark multi-hop question answering dataset.
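The latent-variable treatment described above can be sketched as a standard evidence lower bound (ELBO). This is a minimal illustration only, assuming notation not given in the abstract (multi-hop question $q$, latent sub-questions $s$, context $c$, generative parameters $\theta$, inference parameters $\phi$); the paper's exact objective may differ.

\begin{equation*}
\log p_{\theta}(q \mid c)
\;\ge\;
\mathbb{E}_{q_{\phi}(s \mid q, c)}\!\left[ \log p_{\theta}(q \mid s, c) \right]
\;-\;
\mathrm{KL}\!\left( q_{\phi}(s \mid q, c) \,\|\, p_{\theta}(s \mid c) \right)
\end{equation*}

Here the composition model $p_{\theta}(q \mid s, c)$ and the sub-question prior $p_{\theta}(s \mid c)$ correspond to the two phases, while $q_{\phi}$ approximates the posterior over the unobserved decomposition.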
Huang, X., Qi, J., Sun, Y., & Zhang, R. (2021). Latent Reasoning for Low-Resource Question Generation. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 3008–3022). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.265