Retrieval from OCR text: RISOT track

Kripabandhu Ghosh; Swapan Kumar Parui

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 7536 LNCS 214-226

DOI: 10.1007/978-3-642-40087-2_21

Abstract

In this paper, we present our work in the RISOT track of FIRE 2011. We describe an error modeling technique for OCR errors in an Indic script. Based on the error model, we apply a two-fold error correction method to the OCRed corpus. First, we correct the corpus using two approaches: correction with full confidence and correction without full confidence. Second, we use query expansion for error correction. Our retrieval results are significantly better than the baseline, and the difference between our best result and the original text run is not significant.

APA

Ghosh, K., & Parui, S. K. (2013). Retrieval from OCR text: RISOT track. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7536 LNCS, pp. 214–226). https://doi.org/10.1007/978-3-642-40087-2_21
