Over the past decade, large-scale supervised learning corpora have enabled machine learning researchers to make substantial advances. However, to this date, there are no large-scale questionanswer corpora available. In this paper we present the 30M Factoid Question- Answer Corpus, an enormous questionanswer pair corpus produced by applying a novel neural network architecture on the knowledge base Freebase to transduce facts into natural language questions. The produced question-answer pairs are evaluated both by human evaluators and using automatic evaluation metrics, including well-established machine translation and sentence similarity metrics. Across all evaluation criteria the questiongeneration model outperforms the competing template-based baseline. Furthermore, when presented to human evaluators, the generated questions appear to be comparable in quality to real human-generated questions.
CITATION STYLE
Serban, I. V., García-Durán, A., Gulcehre, C., Ahn, S., Chandar, S., Courville, A., & Bengio, Y. (2016). Generating factoid questionswith recurrent neural networks: The 30M factoid question-answer corpus. In 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers (Vol. 1, pp. 588–598). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/p16-1056
Mendeley helps you to discover research relevant for your work.