Vowel normalization by frequency warped spectral matching

  • Matsumoto H
  • Wakita H

Abstract

Normalization of formant frequencies has frequently been used to eliminate inter-speaker differences in vowel recognition. However, estimating formant frequencies becomes difficult under certain circumstances, such as for telephone speech. This paper presents an approach to vowel normalization based on frequency warped spectral matching. A frequency normalized distance between test and reference spectra is defined as the minimum mean square difference over all possible choices of frequency warping functions under certain nonlinearity constraints and boundary conditions. After spectral slope differences due to individual glottal characteristics are adaptively eliminated, the spectral distance is computed by means of dynamic programming. Vowel identification experiments were conducted on the nine American English vowels in /hvd/ utterances spoken by 12 male and 12 female speakers. The results indicated that the frequency warping method substantially increased the identification scores for female vowels when the male vowels were used as reference. They also indicated that, although the improvement in identification was attributable mainly to linear frequency scaling, an additional improvement for the vowel /ae/ was obtained by a slight nonlinear frequency warping. In addition, an application to speaker normalization for word detection in connected speech is discussed. © 1986.
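
The distance described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the actual nonlinearity constraints, so the step limit `max_step`, the fixed-endpoint boundary conditions, and the function names `warped_spectral_distance` and `plain_distance` are all assumptions made for illustration. The adaptive spectral slope compensation is omitted. The inputs are taken to be log-magnitude spectra sampled on a common frequency grid.

```python
import math


def warped_spectral_distance(test, ref, max_step=2):
    """Minimum mean square difference between two spectra over monotone
    frequency warpings, found by dynamic programming.

    Assumptions (not taken from the paper): the warping maps test bin i to
    reference bin w(i), with w(0) = 0 and w(n-1) = m-1 as boundary
    conditions, and 0 <= w(i) - w(i-1) <= max_step as the local constraint.
    """
    n, m = len(test), len(ref)
    inf = math.inf

    # prev[j] = best accumulated squared error with test bin i-1 mapped to j
    prev = [inf] * m
    prev[0] = (test[0] - ref[0]) ** 2  # boundary condition: w(0) = 0

    for i in range(1, n):
        cur = [inf] * m
        for j in range(m):
            # Best predecessor among the allowed steps 0..max_step
            best = min(prev[j - s] for s in range(max_step + 1) if j - s >= 0)
            if best < inf:
                cur[j] = best + (test[i] - ref[j]) ** 2
        prev = cur

    # Boundary condition: w(n-1) = m-1; inf means no admissible warping
    return prev[m - 1] / n


def plain_distance(test, ref):
    """Mean square difference with no frequency warping, for comparison."""
    return sum((a - b) ** 2 for a, b in zip(test, ref)) / len(test)


# Toy spectra: the same peak, shifted by one frequency bin
reference = [0, 0, 1, 5, 1, 0, 0, 0, 0, 0]
shifted = [0, 0, 0, 1, 5, 1, 0, 0, 0, 0]

print(plain_distance(shifted, reference))
print(warped_spectral_distance(shifted, reference))
```

In this toy case the warping absorbs the one-bin shift of the peak, so the warped distance is smaller than the unwarped one. A stricter or looser `max_step` changes how much nonlinear warping is allowed.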

Authors

  • Hiroshi Matsumoto

  • Hisashi Wakita
