Abstract
Data analytics helps companies analyze customer trends, make better business decisions, and optimize their performance. Scanned document analysis is an important step in data analytics, and automatically extracting information from scanned receipts has potential applications across industries. A receipt typically contains both printed and handwritten characters. Receipt images are often degraded or of low resolution because of paper damage and poor scanning quality, so correctly recognizing each character is a challenge. This work focuses on building an improved Convolutional Neural Network (CNN) model with regularization for classifying all English letters (both uppercase and lowercase) and the digits 0 to 9. The training data contains about 60,000 character images (English letters and digits), drawn from Windows TrueType (.ttf) font files and from different scanned receipts. We developed several CNN models for this 62-class classification problem, using different regularization and dropout techniques. The hyperparameters of the CNN were adjusted, and different optimization methods were considered, to obtain the best accuracy. The performance of each CNN model is analyzed in terms of accuracy, precision, recall, F1 score, and the confusion matrix to identify the best model. The prediction error of the model is calculated under Gaussian noise and impulse noise at different noise levels.
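The noise evaluation mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not specify how the noise was generated or how error was measured, so everything here is an assumption made for illustration: images are grayscale lists of rows with pixel values in [0, 1], impulse noise is modeled as salt-and-pepper noise, prediction error is the misclassification rate, and the names `add_gaussian_noise`, `add_impulse_noise`, and `prediction_error` are hypothetical helpers, not the authors' code.

```python
import random


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise (std. dev. sigma) to each pixel, clipped to [0, 1]."""
    return [
        [min(1.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
        for row in image
    ]


def add_impulse_noise(image, density, rng):
    """Salt-and-pepper noise: with probability `density`, set a pixel to 0.0 or 1.0."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < density:
                new_row.append(rng.choice((0.0, 1.0)))
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


def prediction_error(predict, images, labels):
    """Fraction of images whose predicted label differs from the true label."""
    wrong = sum(1 for image, label in zip(images, labels) if predict(image) != label)
    return wrong / len(labels)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    # Hypothetical stand-in for a trained 62-class CNN: thresholds mean brightness.
    def toy_predict(image):
        pixels = [p for row in image for p in row]
        return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"

    images = [[[0.9] * 8 for _ in range(8)], [[0.1] * 8 for _ in range(8)]]
    labels = ["bright", "dark"]
    for density in (0.0, 0.1, 0.3):
        noisy = [add_impulse_noise(img, density, rng) for img in images]
        print(density, prediction_error(toy_predict, noisy, labels))
```

In practice, `toy_predict` would be replaced by the trained model's prediction function, and the loop would sweep both noise types over the noise levels of interest.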
Cite
Vincent, E. J., & Hari, V. S. (2023). Classification of Letter Images from Scanned Invoices using CNN. International Journal of Computing, 22(3), 360–366. https://doi.org/10.47839/ijc.22.3.3232