A deep learning approach to speech recognition of digits


Abstract

One of the technologies gaining popularity in recent years is speech recognition. It has a widespread user base, ranging from organizations to individuals, because of the benefits it provides. Today, there are many virtual voice assistants on the market, such as Siri, Cortana and Alexa. However, they all require an active internet connection and are not supported on all devices. We have built a digit recognition system that works offline on desktop and mobile devices. This speech-to-text system can recognize a spoken sequence of digits from 0 to 9 and distinguish variations such as “double two” and “triple six”. Our approach records a digit sequence as audio input, pre-processes it by extracting the peak amplitudes, applies Mel Frequency Cepstral Coefficients (MFCC) feature extraction, and finally feeds the feature vector to an artificial neural network that outputs the most probable class. We then exported the model to a minimized configuration that is simple to use on mobile platforms. We obtained an accuracy of 87% on the validation set and 86% on the test set.
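
To illustrate what handling variations such as “double two” and “triple six” can involve, here is a minimal sketch of a post-processing step that turns a sequence of predicted word labels into a digit string. It is an illustration only, not the authors' method: the label set (`DIGITS`, `REPEATS`), the function name `decode_tokens`, and the assumption that the classifier emits one word label per audio segment are all hypothetical and are not taken from the paper.

```python
# Hypothetical label set: the ten digit words plus two repeat modifiers.
# The paper does not specify its output classes; these are assumptions.
DIGITS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
REPEATS = {"double": 2, "triple": 3}


def decode_tokens(tokens):
    """Convert predicted word labels into a digit string.

    A repeat modifier ("double", "triple") applies to the digit that
    immediately follows it.
    """
    out = []
    repeat = 1  # how many times the next digit should be emitted
    for token in tokens:
        word = token.lower()
        if word in REPEATS:
            if repeat != 1:
                raise ValueError("two repeat modifiers in a row")
            repeat = REPEATS[word]
        elif word in DIGITS:
            out.append(DIGITS[word] * repeat)
            repeat = 1
        else:
            raise ValueError(f"unknown label: {token!r}")
    if repeat != 1:
        raise ValueError("repeat modifier not followed by a digit")
    return "".join(out)
```

Under these assumptions, `decode_tokens(["double", "two", "triple", "six"])` returns `"22666"`. A real system could equally treat “double two” as a single output class; the abstract does not say which design was used.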

Cite

Gopinath, G., Kumar, J. K., Shetty, N., & Shylaja, S. S. (2019). A deep learning approach to speech recognition of digits. In Communications in Computer and Information Science (Vol. 1045, pp. 117–126). Springer Verlag. https://doi.org/10.1007/978-981-13-9939-8_11
