OCR Evaluation Tools for the 21st Century

  • Santos E

Abstract

We introduce ocreval, a port of the ISRI OCR Evaluation Tools, now with Unicode support. We describe how we upgraded the ISRI OCR Evaluation Tools to support modern text processing tasks. ocreval produces character-level and word-level accuracy reports and supports all characters representable in the UTF-8 character encoding scheme. In addition, we have implemented the Unicode default word boundary specification in order to support word-level accuracy reports for a broad range of writing systems. We argue that character-level and word-level accuracy reports produce confusion matrices that are useful for tasks beyond OCR evaluation, including tasks supporting the study and computational modeling of endangered languages.
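
To make the idea of a character-level accuracy report and its confusion counts concrete, here is a minimal Python sketch. It is not ocreval's implementation: it assumes accuracy is computed as (characters minus errors) divided by characters, with errors taken as the Levenshtein edit distance between the ground truth and the OCR output, compared code point by code point after NFC normalization. The function names `align_counts` and `character_accuracy` are hypothetical and exist only for this illustration.

```python
from collections import Counter
import unicodedata


def align_counts(truth: str, ocr: str):
    """Return (edit_distance, confusions) for two strings, compared by code point.

    Illustrative only; this is an assumed approach, not ocreval's code.
    """
    # Assumption: normalize to NFC so canonically equivalent text compares equal.
    truth = unicodedata.normalize("NFC", truth)
    ocr = unicodedata.normalize("NFC", ocr)
    n, m = len(truth), len(ocr)

    # dist[i][j] is the edit distance between truth[:i] and ocr[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if truth[i - 1] == ocr[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # character missing from the OCR output
                dist[i][j - 1] + 1,         # spurious character in the OCR output
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back through the table to recover which characters were confused.
    confusions = Counter()
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (truth[i - 1] != ocr[j - 1]):
            if truth[i - 1] != ocr[j - 1]:
                confusions[(truth[i - 1], ocr[j - 1])] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            confusions[(truth[i - 1], "")] += 1  # deleted by the OCR engine
            i -= 1
        else:
            confusions[("", ocr[j - 1])] += 1    # inserted by the OCR engine
            j -= 1
    return dist[n][m], confusions


def character_accuracy(truth: str, ocr: str) -> float:
    """Assumed definition: (characters - errors) / characters."""
    errors, _ = align_counts(truth, ocr)
    n = len(unicodedata.normalize("NFC", truth))
    if n == 0:
        return 1.0 if errors == 0 else 0.0
    return (n - errors) / n
```

Calling `character_accuracy("naïve café", "naive cafe")` counts two substitutions over ten characters, and `align_counts` reports the pairs `("ï", "i")` and `("é", "e")`. A word-level report would additionally need a word segmenter that follows the Unicode default word boundary rules, which the Python standard library does not provide and which this sketch leaves out.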


Cite

Santos, E. A. (2019). OCR Evaluation Tools for the 21st Century. Proceedings of the Workshop on Computational Methods for Endangered Languages. https://doi.org/10.33011/computel.v1i.345
