@PhilosTEI: Building Corpora for Philosophers

  • van Hessen A

Abstract

The step to e-research in philosophy depends on the availability of high-quality, easily and freely accessible corpora in a sustainable format, composed of multi-language, multi-script books from different historical periods. Corpora matching these needs are at the moment virtually non-existent. Within @PhilosTEI, we have addressed this corpus-building problem by developing an open-source, web-based, user-friendly workflow from textual images to TEI, based on state-of-the-art open-source OCR software, namely Tesseract, and a multi-language version of TICCL, a powerful OCR post-correction tool. We have demonstrated the utility of the tool by applying it to a multilingual, multi-script corpus of important eighteenth- to twentieth-century European philosophical texts.
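
To make the image-to-TEI workflow described in the abstract more concrete, the following is a minimal sketch and not the @PhilosTEI implementation. It assumes the Tesseract command-line tool is installed for the OCR step and omits the TICCL post-correction step entirely. The function names `ocr_page` and `pages_to_tei`, and the placeholder header texts, are hypothetical and chosen only for illustration.

```python
import subprocess
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def ocr_page(image_path, lang="eng"):
    """Run the Tesseract CLI on one page image and return its plain text.

    Assumes the `tesseract` binary and the requested language data are
    installed; "stdout" as the output base makes Tesseract print the text.
    """
    result = subprocess.run(
        ["tesseract", image_path, "stdout", "-l", lang],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def pages_to_tei(title, pages):
    """Wrap a list of page texts in a minimal TEI document (hypothetical helper).

    Each page is preceded by a <pb n="..."/> page break; blank lines
    separate paragraphs, and whitespace inside a paragraph is normalized.
    """
    ET.register_namespace("", TEI_NS)

    def el(parent, tag, text=None):
        node = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
        if text is not None:
            node.text = text
        return node

    root = ET.Element(f"{{{TEI_NS}}}TEI")
    file_desc = el(el(root, "teiHeader"), "fileDesc")
    el(el(file_desc, "titleStmt"), "title", title)
    # Placeholder header content, for illustration only.
    el(el(file_desc, "publicationStmt"), "p", "Unpublished OCR output")
    el(el(file_desc, "sourceDesc"), "p", "Converted from page images")

    body = el(el(root, "text"), "body")
    for number, page_text in enumerate(pages, start=1):
        el(body, "pb").set("n", str(number))
        for paragraph in page_text.split("\n\n"):
            paragraph = " ".join(paragraph.split())
            if paragraph:
                el(body, "p", paragraph)
    return ET.tostring(root, encoding="unicode")
```

In such a sketch, the texts returned by `ocr_page` for each scanned page would be collected into a list and passed to `pages_to_tei`. A post-correction step such as TICCL would sit between the two calls.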

Cite

van Hessen, A. (2017). @PhilosTEI: Building Corpora for Philosophers. In CLARIN in the Low Countries (pp. 379–392). Ubiquity Press. https://doi.org/10.5334/bbi.32