A deep learning approach to speech recognition of digits

Gagan Gopinath; Joel Kiran Kumar; Nirmit Shetty; S. S. Shylaja

Conference Proceedings

Communications in Computer and Information Science (2019) 1045 117-126

DOI: 10.1007/978-981-13-9939-8_11

Abstract

Speech recognition has gained popularity in recent years, with a user base ranging from organizations to individuals who value the benefits it provides. Many virtual voice assistants are now on the market, including Siri, Cortana and Alexa. However, they all require an active internet connection and are not supported on all devices. We have built a digit recognition system that works offline on desktop and mobile devices. This speech-to-text system recognizes a spoken sequence of digits from 0 to 9 and distinguishes variations such as “double two” and “triple six”. Our approach records a digit-sequence audio clip as input, pre-processes it by extracting the peak amplitudes, applies Mel Frequency Cepstral Coefficients (MFCC) feature extraction, and finally feeds the feature vector to an artificial neural network that outputs the most probable class. We then exported the model to a minimized configuration that is simple to use on mobile platforms. We obtained an accuracy of 87% on the validation set and 86% on the test set.
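
The pipeline described in the abstract (peak-amplitude pre-processing, MFCC extraction, neural network classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give frame sizes, filter counts, network shape, or how peak amplitudes are used, so every parameter below, the amplitude-threshold trimming in `trim_to_peaks`, the frame-averaged feature vector, and the ten-class output are assumptions chosen for illustration. The network weights are random, so the predicted class is meaningless; only the data flow is shown.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def trim_to_peaks(signal, threshold_ratio=0.1):
    # Assumed pre-processing: keep the span between the first and last
    # samples whose magnitude reaches a fraction of the peak amplitude.
    envelope = np.abs(signal)
    active = np.nonzero(envelope >= threshold_ratio * envelope.max())[0]
    if active.size == 0:
        return signal
    return signal[active[0]:active[-1] + 1]

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters spaced evenly on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            bank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            bank[i - 1, k] = (right - k) / max(right - center, 1)
    return bank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # Pad very short inputs so that at least one frame exists.
    if len(signal) < frame_len:
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    log_energy = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # Unnormalized DCT-II over the filter axis, keeping the first n_coeffs terms.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1) / (2 * n_filters))
    return log_energy @ basis.T

def predict_proba(features, w1, b1, w2, b2):
    # One hidden ReLU layer followed by a softmax over the classes.
    hidden = np.maximum(0.0, features @ w1 + b1)
    logits = hidden @ w2 + b2
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Synthetic stand-in for a recording: silence, a 440 Hz tone, silence.
sr = 16000
t = np.arange(sr // 2) / sr
audio = np.concatenate([np.zeros(4000), 0.5 * np.sin(2 * np.pi * 440 * t), np.zeros(4000)])

segment = trim_to_peaks(audio)
feats = mfcc(segment, sr)          # shape: (frames, 13)
vector = feats.mean(axis=0)        # assumed fixed-length feature vector

# Untrained, randomly initialized weights (hypothetical 13-32-10 network).
rng = np.random.default_rng(0)
w1, b1 = rng.normal(scale=0.1, size=(13, 32)), np.zeros(32)
w2, b2 = rng.normal(scale=0.1, size=(32, 10)), np.zeros(10)
probs = predict_proba(vector, w1, b1, w2, b2)
predicted = int(np.argmax(probs))
print("most probable class:", predicted)
```

In a real system the weights would come from training on labeled digit recordings, and a library MFCC implementation would normally replace the hand-written one.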

Citation (APA)

Gopinath, G., Kumar, J. K., Shetty, N., & Shylaja, S. S. (2019). A deep learning approach to speech recognition of digits. In Communications in Computer and Information Science (Vol. 1045, pp. 117–126). Springer Verlag. https://doi.org/10.1007/978-981-13-9939-8_11
