Language corpora: The Czech case

František Čermák

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2001) 2166 21-30

DOI: 10.1007/3-540-44805-5_3

Abstract

Against the background of a growing need for information, which for language used to be supplied only in a rather limited way, this contribution outlines and discusses the new solution found in language corpora and the way it has been implemented. For the Czech language, this solution has materialized in the representative 100-million-word Czech National Corpus (CNC, 2000). A brief tour is offered through the various stages of its build-up, characterizing the individual corpora within the CNC and giving some figures on the proportions of the types of language represented. The last part of the contribution sets out a minimal programme for further research, along with desiderata to be followed in general in this important and international branch of modern science.

Cite

Čermák, F. (2001). Language corpora: The Czech case. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2166, pp. 21–30). Springer Verlag. https://doi.org/10.1007/3-540-44805-5_3
