Speaker-identifying features based on formant tracks

  • Goldstein U

Abstract

The formant structure of three diphthongs, four tense vowels, and three retroflex sounds was examined in detail for possible speaker-identifying features. These sounds were spoken five times each in sentence context by ten speakers of General American on one day and by six of the speakers on a second day at least three weeks later. Formant tracks were computed for each sound under investigation using covariance-type pitch-asynchronous linear prediction together with a root-finding algorithm. The interspeaker variability of about 200 measurements made on these formant tracks was compared initially with intraspeaker variability through the calculation of F ratios. Those with average F ratios greater than 60 were evaluated further with a probability-of-error criterion. Features that are potentially most effective in identifying speakers are the minimum second-formant value in [-r], the maximum first-formant value in [-r], the maximum second-formant values of [o] and [-I], and the minimum third-formant value of [-]. The individual differences apparent in these sounds presumably depend more on speaker habits than on vocal-tract anatomy. The error bound predicted for a speaker identification procedure based on these five features is 0.24%. An identification experiment using only the best two features gave 12 errors out of 80 identifications.

Subject Classification: [43]70.65, [43]70.40.

Cite

Goldstein, U. G. (1976). Speaker-identifying features based on formant tracks. The Journal of the Acoustical Society of America, 59(1), 176–182. https://doi.org/10.1121/1.380837