FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain

Abstract

This paper introduces FrenchMedMCQA, the first publicly available Multiple-Choice Question Answering (MCQA) dataset in French for the medical domain. It is composed of 3,105 questions taken from real exams of the French medical specialization diploma in pharmacy, mixing questions with a single correct answer and questions with multiple correct answers. Each instance of the dataset contains an identifier, a question, five possible answers, and their manual correction(s). We also propose first baseline models for this MCQA task in order to report on current performance and to highlight the difficulty of the task. A detailed analysis of the results shows that it is necessary to have representations adapted to the medical domain or to the MCQA task: in our case, English specialized models yielded better results than generic French ones, even though FrenchMedMCQA is in French. The corpus, models, and tools are available online.
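
To make the instance layout described in the abstract concrete, here is a minimal Python sketch of one record: an identifier, a question, five candidate answers, and one or more correct answers. The class name, field names, option labels (`a` to `e`), and the `exact_match` helper are all hypothetical and chosen for illustration; they are not taken from the released corpus or tools, and the sample text is a placeholder rather than a real exam question.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class MCQAInstance:
    """One question record (hypothetical field names, not the corpus schema)."""
    identifier: str
    question: str
    options: Dict[str, str]   # five labelled candidate answers
    correct: FrozenSet[str]   # labels of the manually corrected answer(s)

    def __post_init__(self) -> None:
        # The abstract states that every instance has five possible answers.
        if len(self.options) != 5:
            raise ValueError("expected exactly five answer options")
        # At least one correct label, and every label must name an option.
        if not self.correct or not self.correct <= set(self.options):
            raise ValueError("correct answers must be a non-empty subset of the options")

    @property
    def is_multiple(self) -> bool:
        """True when the question has more than one correct answer."""
        return len(self.correct) > 1


def exact_match(instance: MCQAInstance, predicted: Iterable[str]) -> bool:
    """Assumed scoring rule: a prediction counts only if it selects
    exactly the set of correct labels, no more and no fewer."""
    return frozenset(predicted) == instance.correct


# Placeholder instance for illustration only.
example = MCQAInstance(
    identifier="placeholder-001",
    question="<question text>",
    options={label: f"<answer {label}>" for label in "abcde"},
    correct=frozenset({"a", "c"}),
)
```

Storing the correct answers as a set lets single-answer and multiple-answer questions share one representation, so a strict set comparison such as `exact_match(example, {"a", "c"})` works for both.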

Cite

Labrak, Y., Bazoge, A., Dufour, R., Daille, B., Gourraud, P. A., Morin, E., & Rouvier, M. (2022). FrenchMedMCQA: A French Multiple-Choice Question Answering Dataset for Medical domain. In LOUHI 2022 - 13th International Workshop on Health Text Mining and Information Analysis, Proceedings of the Workshop (pp. 41–46). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.louhi-1.5
