WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages

Abstract

Automatic speech recognition systems translate speech signals into the corresponding text representation. This translation is used in a variety of applications such as voice-enabled commands, assistive devices, and bots. There is a significant lack of efficient technology for Indian languages. In this paper, a wavelet transformer for automatic speech recognition (WTASR) of Indian languages is proposed. Speech signals contain both high- and low-frequency content at different times because of variation in the speaker's speech, so wavelets are used to let the network analyze the signal at multiple scales. The wavelet decomposition of the signal is fed into the network to generate the text. The transformer network comprises an encoder-decoder system for speech translation. The model is trained on an Indian language dataset to translate speech into the corresponding text. The proposed method is compared with other state-of-the-art methods. The results show that the proposed WTASR has a low word error rate and can be used for effective speech recognition of Indian languages.
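
To make the multiscale idea concrete, the following is a minimal sketch of a multilevel discrete wavelet decomposition of a one-dimensional signal. The abstract does not state which wavelet family or how many decomposition levels WTASR uses, so the Haar wavelet, the function names `haar_dwt_level` and `haar_wavedec`, and the toy signal are all assumptions made for illustration, not the authors' implementation.

```python
import math


def haar_dwt_level(signal):
    """One level of the Haar discrete wavelet transform (illustrative choice)."""
    x = list(signal)
    if len(x) % 2:  # pad odd-length input by repeating the last sample
        x.append(x[-1])
    s = math.sqrt(2.0)
    # Low-pass branch: scaled pairwise sums capture the coarse trend.
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    # High-pass branch: scaled pairwise differences capture fine changes.
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multilevel decomposition.

    Returns [approx_L, detail_L, ..., detail_1], coarsest scale first.
    """
    approx = [float(v) for v in signal]
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt_level(approx)
        details.append(detail)
    return [approx] + details[::-1]


if __name__ == "__main__":
    # Hypothetical toy signal: a slow sine wave plus a short fast oscillation.
    n = 64
    toy = [math.sin(2 * math.pi * t / n) for t in range(n)]
    for t in range(40, 48):
        toy[t] += 0.5 * (-1) ** t
    for band in haar_wavedec(toy, levels=3):
        print(len(band), round(sum(v * v for v in band), 4))
```

Each returned band covers a different scale: the final detail band reacts to rapid sample-to-sample changes, while the approximation band keeps the slow trend. In a pipeline like the one the abstract describes, such bands (or features derived from them) would be the input to the transformer encoder; how WTASR arranges them is not specified here.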

Cite

Choudhary, T., Goyal, V., & Bansal, A. (2023). WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages. Big Data Mining and Analytics, 6(1), 85–91. https://doi.org/10.26599/BDMA.2022.9020017
