Learning to Generate Questions by Learning to Recover Answer-containing Sentences

Seohyun Back; Akhil Kedia; Sai Chetan Chinthakindi; Haejun Lee; Jaegul Choo

Conference Proceedings, Open Access

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (2021) 1516-1529

DOI: 10.18653/v1/2021.findings-acl.132

14 Citations

54 Readers

Abstract

To train a question-answering model based on machine reading comprehension (MRC), significant effort is required to prepare annotated training data composed of questions and their answers from contexts. Recent research has focused on synthetically generating a question from a given context and an annotated (or generated) answer by training an additional generative model to augment the training data. In light of this research direction, we propose a novel pre-training approach that learns to generate contextually rich questions by recovering answer-containing sentences. We evaluate our method against existing ones in terms of the quality of generated questions and the accuracy of MRC models fine-tuned on data synthetically generated by our method. We consistently improve the question generation capability of existing models such as T5 and UniLM, and achieve state-of-the-art results on MS MARCO and NewsQA, and results comparable to the state of the art on SQuAD. Additionally, the data synthetically generated by our approach boosts downstream MRC accuracy across a wide range of datasets, such as SQuAD-v1.1 and v2.0, KorQuAD, and BioASQ, without any modification to the existing MRC models. Furthermore, our method is especially effective when only a limited amount of pre-training or downstream MRC data is available.
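
As an illustration of what "recovering answer-containing sentences" could look like as training data, here is a minimal sketch. It is not the authors' implementation. It assumes a naive regex sentence splitter, an answer given as a plain string, and a hypothetical `[MASK_SENT]` placeholder token; the abstract specifies none of these details.

```python
import re

MASK_TOKEN = "[MASK_SENT]"  # hypothetical placeholder, not taken from the paper


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace (assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def make_recovery_example(context, answer):
    """Build one (source, target) pair for a sentence-recovery objective.

    The first sentence containing `answer` is removed from the context and
    becomes the generation target. The source keeps the remaining context
    plus the answer. Returns None if no sentence contains the answer.
    """
    sentences = split_sentences(context)
    for i, sentence in enumerate(sentences):
        if answer in sentence:
            masked = sentences[:i] + [MASK_TOKEN] + sentences[i + 1:]
            source = f"answer: {answer} context: {' '.join(masked)}"
            return source, sentence
    return None


context = (
    "SQuAD is a reading comprehension dataset. "
    "It was released in 2016. "
    "Questions were written by crowdworkers."
)
source, target = make_recovery_example(context, "2016")
print(source)
print(target)
```

Under these assumptions, a sequence-to-sequence model such as T5 could be trained to map `source` to `target` before being fine-tuned on question generation; how the pairs are actually consumed is not stated in the abstract.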

Cite (APA)

Back, S., Kedia, A., Chinthakindi, S. C., Lee, H., & Choo, J. (2021). Learning to Generate Questions by Learning to Recover Answer-containing Sentences. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 1516–1529). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.findings-acl.132
