CLEAR-Simple Corpus for Medical French

Abstract

The availability of corpora with both technical and simplified content is crucial for developing and testing text simplification methods. We describe such a corpus for medical French. The corpus contains texts from three sources: encyclopedia articles, drug leaflets, and scientific summaries. Each source provides comparable information in specialized and plain language. A subset of the corpus has been processed manually to find and align parallel sentences. This subset currently contains 663 parallel sentence pairs. The alignment was performed by two annotators, with an inter-annotator agreement of 0.76. The corpus with comparable data is available for research.

Cite

Grabar, N., & Cardon, R. (2018). CLEAR-Simple Corpus for Medical French. In ATA 2018 - 1st Workshop on Automatic Text Adaptation, Proceedings of the Workshop (pp. 3–9). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w18-7002
