Improving Speech Recognition Rate through Analysis Parameters

Deividas Eringis; Gintautas Tamulevičius

Journal Article, Open Access

Electrical, Control and Communication Engineering (2014) 5(1) 61-66

DOI: 10.2478/ecce-2014-0009

Abstract

The speech signal is redundant and non-stationary by nature. Because of the inertia of the vocal tract, its variations are not very rapid, and the signal can be considered stationary over short segments. The short-time magnitude spectrum is presumed to contain the most distinctive information in speech. This is the main reason for analyzing the speech signal frame by frame: the signal is divided into overlapping segments (so-called frames). Segments of 15-25 ms with an overlap of 10-15 ms are typically used.

In this paper we present the results of our investigation of the influence of analysis window length and frame shift on the speech recognition rate. We analyzed three cepstral analysis approaches for this purpose: mel-frequency cepstral analysis (MFCC), linear prediction cepstral analysis (LPCC), and perceptual linear prediction cepstral analysis (PLPC). The highest speech recognition rate was obtained using a 10 ms analysis window with the frame shift varying from 7.5 to 10 ms (regardless of analysis type). The highest increase in recognition rate was 2.5 %.
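To make the two analysis parameters concrete, the following minimal Python sketch splits a signal into overlapping frames for a given window length and frame shift. It is an illustration only, not the authors' implementation: the function name `frame_signal`, the 16 kHz sampling rate, the synthetic test tone, and the Hamming weighting are all assumptions that the abstract does not specify.

```python
import math


def frame_signal(samples, sample_rate, window_ms, shift_ms):
    """Split samples into overlapping, Hamming-weighted frames.

    window_ms is the analysis window length and shift_ms the frame shift,
    so consecutive frames overlap by (window_ms - shift_ms) milliseconds.
    """
    win = int(round(sample_rate * window_ms / 1000.0))  # samples per frame
    hop = int(round(sample_rate * shift_ms / 1000.0))   # samples per shift
    if win <= 0 or hop <= 0:
        raise ValueError("window and shift must each cover at least one sample")

    # Hamming weights (an assumed choice; the abstract names no window type).
    if win == 1:
        weights = [1.0]
    else:
        weights = [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (win - 1))
                   for n in range(win)]

    frames = []
    # Keep only complete frames; trailing samples that do not fill a window are dropped.
    for start in range(0, len(samples) - win + 1, hop):
        segment = samples[start:start + win]
        frames.append([s * w for s, w in zip(segment, weights)])
    return frames


# Assumed setup: one second of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
signal = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate)
          for t in range(sample_rate)]

# 10 ms window with 7.5 ms shift, one of the settings reported in the abstract.
frames = frame_signal(signal, sample_rate, window_ms=10, shift_ms=7.5)
print(len(frames), len(frames[0]))
```

With these assumed values, each frame holds 160 samples and consecutive frames start 120 samples apart, so they overlap by 2.5 ms. Each frame would then be passed to a cepstral analysis stage such as MFCC, LPCC, or PLPC.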

Citation (APA)

Eringis, D., & Tamulevičius, G. (2014). Improving Speech Recognition Rate through Analysis Parameters. Electrical, Control and Communication Engineering, 5(1), 61–66. https://doi.org/10.2478/ecce-2014-0009
