Speech signal modeling using multivariate distributions

7Citations
Citations of this article
14Readers
Mendeley users who have this article in their library.

This article is free to access.

Abstract

Using a proper distribution function for speech signal or for its representations is of crucial importance in statistical-based speech processing algorithms. Although the most commonly used probability density function (pdf) for speech signals is Gaussian, recent studies have shown the superiority of super-Gaussian pdfs. A large research effort has focused on the investigation of a univariate case of speech signal distribution; however, in this paper, we study the multivariate distributions of speech signal and its representations using the conventional distribution functions, e.g., multivariate Gaussian and multivariate Laplace, and the copula-based multivariate distributions as candidates. The copula-based technique is a powerful method in modeling non-Gaussian multivariate distributions with non-linear inter-dimensional dependency. The level of similarity between the candidate pdfs and the real speech pdf in different domains is evaluated using the energy goodness-of-fit test. In our evaluations, the best-fitted distributions for speech signal vectors with different lengths in various domains are determined. A similar experiment is performed for different classes of English phonemes (fricatives, nasals, stops, vowels, and semivowel/glides). The evaluation results demonstrate that the multivariate distribution of speech signals in different domains is mostly super-Gaussian, except for Mel-frequency cepstral coefficient. Also, the results confirm that the distribution of the different phoneme classes is better statistically modeled by a mixture of Gaussian and Laplace pdfs. The copula-based distributions provide better statistical modeling of vectors representing discrete Fourier transform (DFT) amplitude of speech vectors with a length shorter than 500 ms.

Cite

CITATION STYLE

APA

Aroudi, A., Veisi, H., Sameti, H., & Mafakheri, Z. (2015). Speech signal modeling using multivariate distributions. Eurasip Journal on Audio, Speech, and Music Processing, 2015(1), 1–14. https://doi.org/10.1186/s13636-015-0078-1

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free