Rejection threshold estimation for an unknown language model in an OCR task

Joaquim Arlandis; Juan Carlos Perez-Cortes; J. Ramon Navarro-Cerdan; Rafael Llobet

Conference Proceedings (Open Access)

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6218 LNCS 738-747

DOI: 10.1007/978-3-642-14980-1_73

Abstract

In an OCR post-processing task, a language model is used to find the best transformation of the OCR hypothesis into a string compatible with the language. The cost of this transformation is used as a confidence value to reject the strings that are less likely to be correct, and the error rate of the accepted strings should be strictly controlled by the user. In this work, the expected error rate distribution of an unknown language model is estimated from a training set composed of known language models. This means that after building a new language model, the user should be able to automatically "fix" the expected error rate at an acceptable level instead of having to deal with an arbitrary threshold. © 2010 Springer-Verlag Berlin Heidelberg.
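The rejection step described in the abstract can be illustrated with a minimal sketch. It is not the estimation method of the paper, which predicts the error rate distribution for an unknown language model from known ones. It only shows the simpler underlying idea, assuming a labeled validation set is available: each string has a transformation cost (lower cost means higher confidence) and a flag saying whether the post-processed result was correct. The function name `pick_threshold`, the sample layout, and the "accept when cost <= threshold" rule are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def pick_threshold(
    samples: List[Tuple[float, bool]], max_error_rate: float
) -> Optional[float]:
    """Return the largest cost threshold whose accepted set stays within
    max_error_rate, or None if no threshold qualifies.

    samples: hypothetical (cost, is_correct) pairs from a labeled set.
    A string is accepted when its cost is <= the threshold.
    """
    ordered = sorted(samples, key=lambda s: s[0])
    best = None
    errors = 0
    i = 0
    n = len(ordered)
    while i < n:
        cost = ordered[i][0]
        # Strings with the same cost are accepted or rejected together.
        while i < n and ordered[i][0] == cost:
            if not ordered[i][1]:
                errors += 1
            i += 1
        # i is now the number of strings accepted at this threshold.
        if errors / i <= max_error_rate:
            best = cost
    return best


# Illustrative data only, not taken from the paper.
validation = [(0.1, True), (0.2, True), (0.5, False), (0.7, True), (0.9, False)]
threshold = pick_threshold(validation, max_error_rate=0.25)
```

With a function like this, the user states an acceptable error rate and obtains a cost threshold, rather than choosing the threshold directly. The paper addresses the harder case where no labeled data exists for the new language model.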

Cite

Arlandis, J., Perez-Cortes, J. C., Navarro-Cerdan, J. R., & Llobet, R. (2010). Rejection threshold estimation for an unknown language model in an OCR task. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6218 LNCS, pp. 738–747). https://doi.org/10.1007/978-3-642-14980-1_73
