ParCoLab: A Parallel Corpus for Serbian, French and English

Aleksandra Miletic; Dejan Stosic; Saša Marjanović

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2017) 10415 LNAI 156-164

DOI: 10.1007/978-3-319-64206-2_18

Abstract

ParCoLab is a trilingual parallel corpus containing texts in Serbian, French and English. It is developed at the CLLE-ERSS research unit (UMR 5263 CNRS) at the University of Toulouse, France, in collaboration with the Department of Romance Studies at the University of Belgrade, Serbia. Serbian being one of the less-resourced European languages, this is an important step towards the creation of freely accessible corpora and NLP tools for this language. Our main goal is to provide the scientific community with a high-quality resource that can be used in a wide range of applications, such as contrastive linguistic studies, NLP research, machine and computer-assisted translation, translation studies, second language learning and teaching, and applied lexicography. The corpus currently contains 7.1M tokens, mainly from literary works, but corpus extension and diversification efforts are ongoing. ParCoLab can be queried online, and a part of it is available for download.

Citation (APA)

Miletic, A., Stosic, D., & Marjanović, S. (2017). ParCoLab: A parallel corpus for Serbian, French and English. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10415 LNAI, pp. 156–164). Springer Verlag. https://doi.org/10.1007/978-3-319-64206-2_18
