Techniques of Czech language lossless text compression

Jiří Ševčík; Jiří Dvorský

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2016) 9842 LNCS 265-276

DOI: 10.1007/978-3-319-45378-1_24

Abstract

For lossless compression of natural-language text, linguistic and grammatical properties extracted by text analysis can be used to achieve a better compression ratio. This work deals with the use of word order, word categories, and grammatical rules in sentences and sentence units in the Czech language. It exploits special grammatical properties of Czech that differ from those of, for example, English. Further, an algorithm is designed that searches for similarities in the analyzed sentence structures and processes them further into the final compressed file. Sentence units are analyzed with a special tool that allows parsing on multiple levels.

Citation (APA)

Ševčík, J., & Dvorský, J. (2016). Techniques of Czech language lossless text compression. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9842 LNCS, pp. 265–276). Springer Verlag. https://doi.org/10.1007/978-3-319-45378-1_24