Design and development of Marathi speech interface system

Abstract

Speech is the most prominent and natural form of communication between humans, and it has the potential to become an important mode of interaction with computers. The man–machine interface has long been a challenging area in natural language processing and speech recognition research, and there is growing interest in developing machines that accept speech as input. People generally communicate with a computer through a mouse or keyboard, which requires training, effort, and knowledge of computers; this is a limitation for some users. Marathi is used as the official language of the Government of Maharashtra, and there is a need for systems that enable human–machine interaction in Indian regional languages. The objective of this research is to design and develop the Marathi Speech Activated Talking Calculator (MSAC) as an interface system. The MSAC is a speaker-dependent speech recognition system that performs basic mathematical operations. It recognizes isolated spoken numbers from 0 to 50 and basic commands such as addition, subtraction, multiplication, start, stop, equal, and exit. A database is an essential requirement for designing a speech recognition system; to meet the objectives, a database with a vocabulary size of 22,320 was developed. The MSAC system was trained and tested with Mel Frequency Cepstral Coefficients (MFCC), Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Linear Predictive Coding (LPC), and RASTA-PLP features, each used individually. It was also trained and tested individually with four fusion feature extraction techniques: Mel Frequency Linear Discriminant Analysis (MFLDA), Mel Frequency Principal Component Analysis (MFPCA), Mel Frequency Discrete Wavelet Transformation (MFDWT), and Mel Frequency Linear Discrete Wavelet Transformation (MFLDWT).
This work also proposes and tests the Wavelet Decomposed Cepstral Coefficient (WDCC) approach with 18, 36, and 54 coefficients. The performance of the MSAC system is measured by accuracy and real-time factor (RTF). The experimental results show that MFCC with 39 coefficients achieved higher accuracy than the 13- and 26-coefficient variants. MFLDWT achieved higher accuracy than MFLDA, MFPCA, MFDWT, and Mel Frequency Principal Discrete Wavelet Transformation (MFPDWT). Based on this research, we conclude that WDCC is a more robust and dynamic technique than MFCC, LDA, PCA, and LPC. The MSAC interface application is directly beneficial to people in their day-to-day activities.
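
The abstract names accuracy and real-time factor (RTF) as the evaluation measures but does not define them. The sketch below assumes the commonly used definitions: accuracy as the percentage of test utterances recognized correctly, and RTF as processing time divided by audio duration. The function names and sample labels are hypothetical and are not taken from the paper.

```python
def accuracy(predicted, expected):
    """Percentage of utterances whose recognized label equals the reference label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("at least one utterance is required")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return 100.0 * correct / len(expected)


def real_time_factor(processing_seconds, audio_seconds):
    """Assumed definition: time spent recognizing divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


# Hypothetical recognizer output for four isolated-word utterances.
reference = ["one", "plus", "two", "equal"]
recognized = ["one", "plus", "ten", "equal"]

print(accuracy(recognized, reference))   # share of correct labels, in percent
print(real_time_factor(0.5, 2.0))        # below 1.0 means faster than real time
```

Under this assumed definition, an RTF below 1.0 means the recognizer processes speech faster than it is spoken, which matters for an interactive calculator.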

Gaikwad, S., Gawali, B., & Mehrotra, S. (2016). Design and development of Marathi speech interface system. In Advances in Intelligent Systems and Computing (Vol. 396, pp. 3–20). Springer Verlag. https://doi.org/10.1007/978-81-322-2653-6_1
