In this paper we present a new method for color segmentation of complex document images, which can be used as a preprocessing step in text information extraction applications. From the edge map of an image, we choose a representative set of samples of the input color image and build the 3D histogram of the RGB color space. These samples are used to locate a relatively large number of suitable points in the 3D color space, which are then used to perform an initial color reduction. This step produces an oversegmented image that usually has no more than 100 colors. To obtain the final result, a mean shift procedure starts from the calculated points and locates the final color clusters of the RGB color distribution. In addition, to overcome noise problems, a proposed edge-preserving smoothing filter is used to enhance the quality of the image. Experimental results show that the method correctly segments complex color documents while removing background noise or low-contrast objects, which is very desirable in text information extraction applications. Additionally, our method is able to cluster arbitrarily shaped distributions. © Springer-Verlag Berlin Heidelberg 2007.
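The mean shift step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the flat (uniform) kernel, the `bandwidth` and `merge_radius` values, and the function names `mean_shift_mode`, `merge_modes`, and `segment_colors` are all assumptions made for illustration. The edge-map sampling and the edge-preserving smoothing filter are not reproduced; the sketch simply takes a list of RGB samples and a list of starting points as given.

```python
import math

def mean_shift_mode(start, samples, bandwidth=30.0, max_iter=100, tol=1e-3):
    """Shift `start` toward a nearby density mode of `samples`.

    Assumption: a flat kernel, i.e. the plain mean of all samples that lie
    within `bandwidth` of the current point in RGB space.
    """
    point = tuple(float(c) for c in start)
    for _ in range(max_iter):
        neighbors = [s for s in samples if math.dist(point, s) <= bandwidth]
        if not neighbors:
            break  # no samples in the window; leave the point where it is
        mean = tuple(sum(s[i] for s in neighbors) / len(neighbors) for i in range(3))
        shift = math.dist(point, mean)
        point = mean
        if shift < tol:
            break  # converged
    return point

def merge_modes(modes, merge_radius):
    """Keep one representative for modes closer than `merge_radius`."""
    merged = []
    for m in modes:
        if all(math.dist(m, k) > merge_radius for k in merged):
            merged.append(m)
    return merged

def segment_colors(pixels, samples, seeds, bandwidth=30.0):
    """Label each pixel with the index of its nearest final color cluster."""
    raw_modes = [mean_shift_mode(s, samples, bandwidth) for s in seeds]
    modes = merge_modes(raw_modes, merge_radius=bandwidth / 2)  # hypothetical choice
    labels = [min(range(len(modes)), key=lambda k: math.dist(p, modes[k]))
              for p in pixels]
    return labels, modes

# Toy data: two tight color groups (dark and light) and three starting points.
samples = [(10, 10, 10), (12, 8, 11), (9, 11, 10),
           (200, 200, 200), (202, 198, 201), (199, 201, 200)]
seeds = [(15, 15, 15), (5, 5, 5), (195, 195, 195)]
pixels = [(11, 10, 9), (201, 199, 200)]
labels, modes = segment_colors(pixels, samples, seeds)
```

In this toy run the two dark starting points converge to the same mode and are merged, so three seeds yield two final clusters. Because each point follows the local density rather than a fixed cluster shape, the same procedure applies to clusters of irregular shape.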
Nikolaou, N., & Papamarkos, N. (2007). Color Segmentation of Complex Document Images. In Communications in Computer and Information Science (Vol. 4 CCIS, pp. 251–263). Springer Verlag. https://doi.org/10.1007/978-3-540-75274-5_17