In large-scale document digitization, orientation detection plays an important role, particularly when digitizing incoming mail. The heavy use of scanners with automatic document feeders, together with the automatic processing of facsimiles, results in many documents being scanned in the wrong orientation. These misoriented scans have to be corrected, because most subsequent processing steps assume that the document was scanned in the right orientation. Several existing methods for orientation detection exploit the fact that, in Latin-script text, ascenders are more likely to occur than descenders. In this paper, we propose a one-step skew and orientation detection method based on a well-established geometric text-line model. The advantage of our method is that it combines accurate skew estimation with robust, resolution-independent orientation detection. An interesting aspect of our method is that it incorporates orientation detection into a previously published skew detection method, making it possible to perform orientation detection, skew estimation, and, if necessary, text-line extraction in one step. The effectiveness of our orientation detection approach is demonstrated on the UW-I dataset and on publicly available test images from OCRopus. Our method achieves an accuracy of 99% on the UW-I dataset and 100% on the test images from OCRopus. © 2009 Springer-Verlag.
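The ascender/descender observation used by existing methods can be illustrated with a minimal sketch. This is not the geometric text-line model proposed in the paper; it only shows the counting heuristic for a single text line, assuming that character bounding boxes, the x-height line, and the baseline have already been estimated. The function name `vote_orientation`, the `(top, bottom)` box representation, and the tolerance `tol` are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (top, bottom) of a character bounding box,
# in image coordinates where y grows downward.
Box = Tuple[float, float]


def vote_orientation(boxes: List[Box], xline: float, baseline: float,
                     tol: float = 2.0) -> str:
    """Guess whether one text line is upright or rotated by 180 degrees.

    A box counts as reaching above the line body if its top lies above the
    x-height line, and as reaching below it if its bottom lies below the
    baseline. In upright Latin-script text the first group (ascenders) is
    expected to be larger than the second (descenders).
    """
    above = sum(1 for top, _ in boxes if top < xline - tol)
    below = sum(1 for _, bottom in boxes if bottom > baseline + tol)
    if above > below:
        return "upright"
    if below > above:
        return "upside_down"
    return "undecided"


# Toy line resembling "help": h and l are ascenders, p is a descender.
upright_line = [(0.0, 20.0), (8.0, 20.0), (0.0, 20.0), (8.0, 28.0)]
result_upright = vote_orientation(upright_line, xline=8.0, baseline=20.0)

# The same line rotated by 180 degrees inside a strip of height 28.
flipped_line = [(28.0 - bottom, 28.0 - top) for top, bottom in upright_line]
result_flipped = vote_orientation(flipped_line, xline=8.0, baseline=20.0)
```

In this toy example the upright line yields two boxes above the x-height line and one below the baseline, and the counts swap after rotation. A practical detector would aggregate such votes over many text lines and would also need to handle the 90° and 270° cases, which this sketch does not address.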
van Beusekom, J., Shafait, F., & Breuel, T. M. (2010). Combined orientation and skew detection using geometric text-line modeling. International Journal on Document Analysis and Recognition, 13(2), 79–92. https://doi.org/10.1007/s10032-009-0109-5