We present an integrated model that uses deep neural networks to solve the street view number recognition problem. Rather than following the traditional approach of first segmenting the image and then recognizing isolated digits, we formulate the task as a sequence recognition problem under a probabilistic treatment. Our model leverages a deep Convolutional Neural Network (CNN) to represent the highly variable appearance of digits in natural images, while a hidden Markov model (HMM) handles the dynamics of the sequence. The two are combined into a hybrid CNN-HMM architecture. With this model, both training and recognition are performed at the word level. There is no explicit segmentation step at all, which saves the considerable labour of designing sophisticated segmentation algorithms or producing fine-grained character labels. To the best of our knowledge, this is the first time a hybrid CNN-HMM model has been applied directly to whole scene text images. Experiments show that the deep CNN dramatically boosts performance compared with a shallow Gaussian Mixture Model (GMM)-HMM model. We obtained competitive results on the Street View House Numbers (SVHN) dataset.
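To make the hybrid idea concrete, here is a minimal sketch of how frame-level CNN outputs could be decoded by an HMM without any explicit segmentation. It is not the authors' implementation. It assumes the common hybrid convention of dividing network posteriors by state priors to obtain scaled likelihoods, a simplified topology with one HMM state per digit plus a background state, and a sliding-window CNN whose per-frame posteriors are replaced here by hand-written toy values. All function names, the state layout, and the numbers are hypothetical.

```python
import numpy as np


def posteriors_to_log_likelihoods(posteriors, priors):
    """Scaled log-likelihoods log p(x|s) + const = log p(s|x) - log p(s).

    Assumed hybrid convention; the abstract does not state this detail.
    """
    return np.log(posteriors) - np.log(priors)


def viterbi(log_emit, log_trans, log_init):
    """Most likely state path for one sequence, computed in the log domain.

    log_emit: (T, S) per-frame state scores, log_trans: (S, S), log_init: (S,).
    """
    n_frames, n_states = log_emit.shape
    score = log_init + log_emit[0]
    back = np.zeros((n_frames, n_states), dtype=int)
    for t in range(1, n_frames):
        # cand[i, j] is the score of being in state i at t-1 and moving to j.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = [int(score.argmax())]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def collapse(path, background=0):
    """Merge repeated states and drop background to get the digit string."""
    digits, prev = [], None
    for state in path:
        if state != prev and state != background:
            digits.append(state)
        prev = state
    return digits


def decode_toy_example():
    # Toy posteriors for 5 frames over 3 states: 0 = background, 1 and 2 = digits.
    # In a real system these would come from the CNN softmax (hypothetical here).
    posteriors = np.array([
        [0.90, 0.05, 0.05],
        [0.05, 0.90, 0.05],
        [0.05, 0.90, 0.05],
        [0.90, 0.05, 0.05],
        [0.05, 0.05, 0.90],
    ])
    priors = np.full(3, 1.0 / 3.0)               # assumed uniform state priors
    trans = np.full((3, 3), 0.2) + 0.4 * np.eye(3)  # self-loops favoured
    init = np.full(3, 1.0 / 3.0)

    log_emit = posteriors_to_log_likelihoods(posteriors, priors)
    path = viterbi(log_emit, np.log(trans), np.log(init))
    return path, collapse(path)
```

Calling `decode_toy_example()` returns the frame-level state path together with the collapsed digit sequence. The point of the sketch is that digit boundaries fall out of the decoded path rather than from a separate segmentation stage.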
Guo, Q., Tu, D., Lei, J., & Li, G. (2015). Hybrid CNN-HMM model for street view house number recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9008, pp. 303–315). Springer Verlag. https://doi.org/10.1007/978-3-319-16628-5_22