DNN-HMM acoustic modeling for large vocabulary Telugu speech recognition

Abstract

This paper focuses on the development of a large vocabulary Telugu speech database. Telugu is a low-resource language for which no standardized database exists for building automatic speech recognition (ASR) systems. The database, named the IIIT-H Telugu speech corpus, consists of neutral speech samples collected from 100 speakers for building a Telugu ASR system. The design of the speech and text corpus and the procedure followed to collect the database are discussed in detail. Preliminary results are reported for ASR models built on this database. The architectural choices of deep neural networks (DNNs) play a crucial role in improving the performance of ASR systems. ASR systems trained with hybrid DNN-HMM acoustic models with more hidden layers performed better than conventional GMM-HMM systems. The Kaldi toolkit is used to build the acoustic models required for the ASR system.

Cite

Vegesna, V. V. R., Gurugubelli, K., Vydana, H. K., Pulugandla, B., Shrivastava, M., & Vuppala, A. K. (2017). DNN-HMM acoustic modeling for large vocabulary Telugu speech recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10682 LNAI, pp. 189–197). Springer Verlag. https://doi.org/10.1007/978-3-319-71928-3_19
