Encoding diachrony: Digital editions of Serbian 18th-century texts

Toma Tasovac; Natalia Ermolaev

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2011) 6966 LNCS 497-500

DOI: 10.1007/978-3-642-24469-8_58

Abstract

Texts in the "Digital Library of Serbian Cultural Heritage of the 18th Century" are encoded as a word-aligned corpus of TEI XML documents in two versions: one using traditional 18th-century orthography, including the graphemes which have since disappeared from Serbian, and one using modernized and standardized Serbian spelling rules that increase the legibility and searchability of these texts for modern users. The corpus also contains linguistic and semantic annotations that add modern phonetic, morphological, lexical and conceptual equivalents to the largely archaic vocabulary. By applying basic techniques of cross-lingual information retrieval to a historical dimension of one language, and making provisions for multiple indexing and annotations, our project exposes a notoriously difficult chapter in the development of the Serbian language to a wider audience, without sacrificing the edition's scholarly potential. © 2011 Springer-Verlag.
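The two-version, word-aligned encoding described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the paper's actual markup is not shown here, so the sketch uses the standard TEI `<choice>`, `<orig>` and `<reg>` elements inside `<w>`, two sample words chosen to show a grapheme (ѣ) no longer used in Serbian, and the hypothetical helper names `aligned_pairs` and `build_index`. It shows how one aligned source could serve both the original and the modernized reading, and how indexing both forms lets a modern-spelling query reach the historical text.

```python
import xml.etree.ElementTree as ET

# TEI namespace; the element choice below is an assumption, not the paper's schema.
NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Illustrative fragment (not taken from the corpus): each <w> pairs an
# original 18th-century spelling with a modernized one.
SAMPLE = """<s xmlns="http://www.tei-c.org/ns/1.0">
  <w><choice><orig>вѣра</orig><reg>вера</reg></choice></w>
  <w><choice><orig>рѣка</orig><reg>река</reg></choice></w>
</s>"""


def aligned_pairs(xml_text):
    """Return (original, modernized) spelling pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for w in root.iterfind(".//tei:w", NS):
        orig = w.findtext("tei:choice/tei:orig", default="", namespaces=NS)
        reg = w.findtext("tei:choice/tei:reg", default="", namespaces=NS)
        pairs.append((orig, reg))
    return pairs


def build_index(pairs):
    """Map every spelling, old or new, to the word positions where it occurs."""
    index = {}
    for position, (orig, reg) in enumerate(pairs):
        for form in (orig, reg):
            index.setdefault(form, set()).add(position)
    return index


pairs = aligned_pairs(SAMPLE)
index = build_index(pairs)
original_text = " ".join(orig for orig, _ in pairs)
modern_text = " ".join(reg for _, reg in pairs)
```

Because both spellings point at the same word position, a search for the modern form retrieves the same token as a search for the historical form, which is the sense in which the alignment resembles cross-lingual retrieval within one language.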

Cite

Tasovac, T., & Ermolaev, N. (2011). Encoding diachrony: Digital editions of Serbian 18th-century texts. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6966 LNCS, pp. 497–500). https://doi.org/10.1007/978-3-642-24469-8_58
