Character encoding of classical languages

Abstract

Underlying any processing and analysis of texts is the need to represent the individual characters that make up those texts. For the first few decades, scholars pioneering digital classical philology had to adopt a range of workarounds for dealing with the scripts of historical languages on systems that were never intended for anything but English. The Unicode Standard addresses many of the issues with character encoding across the world's writing systems, including those used by historical languages, but its practical use in digital classical philology is not without challenges. This chapter starts with a conceptual overview of character coding systems, and the Unicode Standard in particular, and then discusses practical issues relating to the input, interchange, processing and display of classical texts. As well as providing guidelines for interoperability in text representation, it covers various aspects of text processing at the character level, including normalisation, search, regular expressions, collation, and alignment.
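
As a small illustration of why normalisation matters for search, the sketch below uses Python's standard unicodedata module. It is not taken from the chapter; the helper names nfc and strip_diacritics are hypothetical, and removing diacritics is only one possible approach to accent-insensitive matching.

```python
import unicodedata

# The same accented Greek letter can be encoded in more than one way.
precomposed = "\u03ac"       # GREEK SMALL LETTER ALPHA WITH TONOS
decomposed = "\u03b1\u0301"  # alpha followed by COMBINING ACUTE ACCENT
oxia = "\u1f71"              # GREEK SMALL LETTER ALPHA WITH OXIA


def nfc(text: str) -> str:
    """Return text in Normalization Form C (canonical composition)."""
    return unicodedata.normalize("NFC", text)


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: decompose, then drop combining marks.

    This is an assumption about how one might build a search key,
    not a method prescribed by the source.
    """
    parts = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in parts if not unicodedata.combining(ch))


# The raw strings differ code point by code point...
print(precomposed == decomposed)
# ...but compare equal once both are normalised to NFC.
print(nfc(precomposed) == nfc(decomposed) == nfc(oxia))
# A diacritic-free key for accent-insensitive search.
print(strip_diacritics("\u1f04\u03bd\u03b8\u03c1\u03c9\u03c0\u03bf\u03c2"))
```

Normalising both the corpus and the query to the same form before comparison avoids missed matches caused purely by encoding differences.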

Cite

Tauber, J. K. (2019). Character encoding of classical languages. In Digital Classical Philology: Ancient Greek and Latin in the Digital Revolution (pp. 137–57). De Gruyter. https://doi.org/10.1515/9783110599572-009