CzEng 1.6: Enlarged Czech-English parallel corpus with processing tools dockered


Abstract

We present a new release of the Czech-English parallel corpus CzEng. CzEng 1.6 consists of about 0.5 billion words (“gigaword”) in each language. The corpus is equipped with automatic annotation at a deep syntactic level of representation and alternatively in Universal Dependencies. Additionally, we release the complete annotation pipeline as a virtual machine in the Docker virtualization toolkit.

Cite

Bojar, O., Dušek, O., Kocmi, T., Libovický, J., Novák, M., Popel, M., … Variš, D. (2016). CzEng 1.6: Enlarged Czech-English parallel corpus with processing tools dockered. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9924 LNCS, pp. 231–238). Springer Verlag. https://doi.org/10.1007/978-3-319-45510-5_27
