Character encoding of classical languages

James K. Tauber

Book Chapter (Open Access)

De Gruyter, (2019), 137-57

DOI: 10.1515/9783110599572-009

Abstract

Underlying any processing and analysis of texts is the need to represent the individual characters that make up those texts. For the first few decades, scholars pioneering digital classical philology had to adopt various workarounds to handle the scripts of historical languages on systems that were never intended for anything but English. The Unicode Standard addresses many of the issues with character encoding across the world's writing systems, including those used by historical languages, but its practical use in digital classical philology is not without challenges. This chapter starts with a conceptual overview of character coding systems, and the Unicode Standard in particular, and then discusses practical issues relating to the input, interchange, processing and display of classical texts. As well as providing guidelines for interoperability in text representation, it covers various aspects of text processing at the character level, including normalisation, search, regular expressions, collation, and alignment.
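
As a rough illustration of two of the character-level operations the abstract lists (normalisation and search), the following Python sketch uses only the standard library's unicodedata module. It is not taken from the chapter; the function names to_nfc, strip_diacritics and matches_ignoring_diacritics are hypothetical, and stripping every nonspacing mark is one simple assumption about what "accent-insensitive search" should mean for polytonic Greek.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


def strip_diacritics(text: str) -> str:
    """Decompose to NFD, then drop nonspacing marks (general category Mn).

    Assumption: for search purposes, accents, breathings and iota subscript
    can all be ignored. Real applications may want a finer-grained policy.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def matches_ignoring_diacritics(query: str, text: str) -> bool:
    """Substring search that ignores diacritics and case.

    casefold() also folds final sigma to medial sigma, so word-final
    forms still match.
    """
    return strip_diacritics(query).casefold() in strip_diacritics(text).casefold()


# U+1F71 (alpha with oxia) and U+03B1 U+0301 (alpha + combining acute)
# are canonically equivalent; both normalise to U+03AC under NFC.
oxia = "\u1f71"
combining = "\u03b1\u0301"
print(oxia == combining)                  # raw code points differ
print(to_nfc(oxia) == to_nfc(combining))  # equal after normalisation

# Search for an unaccented query in accented text.
print(matches_ignoring_diacritics("ανθρωπος", "ὁ ἄνθρωπος"))
```

Normalising both the stored text and the query to the same form before comparing is what makes the equality and search checks reliable; which form to store (NFC or NFD) is a project-level choice.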

Citation (APA)

Tauber, J. K. (2019). Character encoding of classical languages. In Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution (pp. 137–157). De Gruyter. https://doi.org/10.1515/9783110599572-009
