Calor-quest : Generating a training corpus for machine reading comprehension models from shallow semantic annotations

Frédéric Béchet; Cindy Aloui; Delphine Charlet; Géraldine Damnati; Johannes Heinecke; Alexis Nasr; Frédéric Herlédan

Conference Proceedings (Open Access)

MRQA@EMNLP 2019 - Proceedings of the 2nd Workshop on Machine Reading for Question Answering (2019) 19-26

DOI: 10.18653/v1/d19-5803

5 Citations

71 Readers

Abstract

Machine reading comprehension is a task related to Question-Answering in which questions are not generic in scope but relate to a particular document. Recently, very large corpora (SQuAD, MS MARCO) containing triplets (document, question, answer) have been made available to the scientific community for developing supervised methods based on deep neural networks, with promising results. These methods need very large training corpora to be effective; however, such data currently exist only for English and Chinese. The aim of this study is to develop such resources for other languages by generating questions semi-automatically from the semantic Frame analysis of large corpora. The collection of natural questions is thereby reduced to a validation/test set. We applied this method to the French CALOR-FRAME corpus to develop the CALOR-QUEST resource presented in this paper.

Cite

Béchet, F., Aloui, C., Charlet, D., Damnati, G., Heinecke, J., Nasr, A., & Herlédan, F. (2019). Calor-quest : Generating a training corpus for machine reading comprehension models from shallow semantic annotations. In MRQA@EMNLP 2019 - Proceedings of the 2nd Workshop on Machine Reading for Question Answering (pp. 19–26). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/d19-5803
