Text and non-text scene image classification for visually impaired through AlexNet transfer learning model

ISSN: 2277-3878

Abstract

Natural scene text recognition has been a prevalent and active research field in computer vision in recent years. Assistive devices help visually impaired people perceive scene images by extracting the text they contain, and the first, crucial task in such a device is detecting whether text is present in a scene image. This paper proposes a transfer learning approach that uses a pre-trained CNN to classify images as text or non-text. AlexNet serves as the pre-trained architecture and is adapted as a binary classifier: its first five convolution layers are frozen, and of the last three fully connected layers, the final output layer is resized to two classes. Every image in the dataset is preprocessed before training or testing in two stages, denoising and augmentation. Denoising removes noise from the input image with a Denoising Convolutional Neural Network (DnCNN), while augmentation includes image resizing, since AlexNet accepts only RGB images of size 256x256. The proposed model achieves 99% accuracy on the test dataset.
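The paper does not include implementation code, but the classifier setup it describes maps onto a standard transfer-learning recipe. The sketch below uses PyTorch/torchvision as an assumed framework (the paper does not name one): the convolution layers held in `features` are frozen, and the final fully connected layer is replaced with a 2-way output for the text/non-text decision.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load AlexNet pre-trained on ImageNet.
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Freeze the five convolution layers (torchvision keeps them in `features`).
for param in model.features.parameters():
    param.requires_grad = False

# The last three layers are fully connected; replace the final one with a
# 2-way output layer for the text / non-text binary decision.
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 2)

# Optimize only the parameters that are still trainable; the values of
# lr and momentum here are illustrative, not taken from the paper.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```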
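The preprocessing stage can be sketched similarly. The DnCNN below follows the commonly published configuration (depth 17, 64 feature channels, residual noise prediction); the paper does not state its exact settings, so these values, like the torchvision resize used to meet the 256x256 RGB requirement, are assumptions rather than the authors' configuration.

```python
import torch.nn as nn
from torchvision import transforms

class DnCNN(nn.Module):
    """Residual denoiser: the network predicts the noise and subtracts it."""
    def __init__(self, depth=17, channels=64, image_channels=3):
        super().__init__()
        layers = [nn.Conv2d(image_channels, channels, 3, padding=1),
                  nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(channels, channels, 3, padding=1, bias=False),
                       nn.BatchNorm2d(channels),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(channels, image_channels, 3, padding=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        # Subtract the predicted noise map from the noisy input.
        return x - self.net(x)

# Resize every image to 256x256 RGB before it reaches the classifier,
# matching the augmentation step described in the abstract.
preprocess = transforms.Compose([
    transforms.Lambda(lambda img: img.convert("RGB")),  # force 3 channels
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])
```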

Cite

APA

Anilkumar, B., Velaga, S. M., & Aswani Devi, A. (2019). Text and non-text scene image classification for visually impaired through AlexNet transfer learning model. International Journal of Recent Technology and Engineering, 8(1), 1125–1129.
