Detecting and recognizing logos and brand names in unstructured, highly unpredictable natural images remains a challenging problem. We observe that in most natural images logos are accompanied by associated text. We therefore address logo recognition by first detecting and isolating text of varying color, font size and orientation in the input image using affine-invariant maximally stable extremal regions (MSERs). An off-the-shelf OCR engine then identifies the text associated with the logo image. Next, a grouping technique combines the remaining stable regions based on the spatial proximity of the MSERs. Deep learning has the advantage that optimal features can be learned automatically from image pixel data. This motivates us to feed the clustered logo candidate regions to a pre-trained deep convolutional neural network (DCNN), which generates a set of complex features that are then passed to a multiclass support vector machine (SVM) for classification. We tested the proposed logo recognition system on 32 logo classes and one non-logo class, drawn from the combined FlickrLogos-32 and MICC logo databases, for a total of 23,582 training and testing images. Our method yields robust recognition performance and outperforms state-of-the-art techniques, achieving 97.8% precision, 95.7% recall and 95.7% average accuracy on the combined MICC and FlickrLogos-32 datasets, and 98.6% precision, 97.9% recall and 99.6% average accuracy on the FlickrLogos-32 dataset alone.
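The abstract does not spell out how stable regions are grouped by spatial proximity, so the following is only a minimal sketch of one plausible reading: bounding boxes of candidate regions are merged whenever the gap between them falls below a threshold. The function names `box_gap` and `group_regions`, the box format and the `max_gap` parameter are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[int, int, int, int]


def box_gap(a: Box, b: Box) -> int:
    """Largest axis-wise gap between two boxes (0 if they overlap or touch)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def group_regions(boxes: List[Box], max_gap: int = 10) -> List[Box]:
    """Merge boxes whose gap is at most max_gap (hypothetical threshold).

    Uses union-find so that chains of nearby regions end up in one group,
    then returns one enclosing box per group, sorted for determinism.
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of boxes that lie close enough together.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_gap(boxes[i], boxes[j]) <= max_gap:
                parent[find(i)] = find(j)

    # Compute the enclosing box of each connected group.
    merged: Dict[int, Box] = {}
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        root = find(i)
        if root in merged:
            mx0, my0, mx1, my1 = merged[root]
            merged[root] = (min(mx0, x0), min(my0, y0), max(mx1, x1), max(my1, y1))
        else:
            merged[root] = (x0, y0, x1, y1)
    return sorted(merged.values())


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (15, 0, 25, 10), (100, 100, 120, 120)]
    print(group_regions(candidates, max_gap=10))
```

In a full pipeline, the merged boxes would be cropped from the image and passed to the feature extractor and classifier; the pairwise comparison here is quadratic in the number of regions, which is acceptable for a sketch but may need a spatial index for images with many regions.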
CITATION STYLE
Medhi, M., Sinha, S., & Sahay, R. R. (2016). A text recognition augmented deep learning approach for logo identification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10481 LNCS, pp. 145–156). Springer Verlag. https://doi.org/10.1007/978-3-319-68124-5_13