Speaker-identifying features based on formant tracks

Ursula G. Goldstein

The Journal of the Acoustical Society of America (1976) 59(1) 176-182

DOI: 10.1121/1.380837

Abstract

The formant structure of three diphthongs, four tense vowels, and three retroflex sounds was examined in detail for possible speaker-identifying features. These sounds were spoken five times each in sentence context by ten speakers of General American on one day and by six of the speakers on a second day at least three weeks later. Formant tracks were computed for each sound under investigation using covariance-type pitch-asynchronous linear prediction together with a root-finding algorithm. The interspeaker variability of about 200 measurements made on these formant tracks was compared initially with intraspeaker variability through the calculation of F ratios. Those with average F ratios greater than 60 were evaluated further with a probability-of-error criterion. Features that are potentially most effective in identifying speakers are the minimum second-formant value in [-r], the maximum first-formant value in [-r], the maximum second-formant values of [o] and [-I], and the minimum third-formant value of [-]. The individual differences apparent in these sounds presumably depend more on speaker habits than on vocal-tract anatomy. The error bound predicted for a speaker identification procedure based on these five features is 0.24%. An identification experiment using only the best two features gave 12 errors out of 80 identifications.

Subject Classification: [43]70.65, [43]70.40.
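The F-ratio screening described in the abstract can be illustrated with a minimal sketch. It assumes the standard one-way analysis-of-variance definition (between-speaker mean square divided by within-speaker mean square); the abstract does not give the exact formula used in the paper, so this is an assumption. The function name `f_ratio` and the sample values are hypothetical and are not data from the study.

```python
from statistics import mean


def f_ratio(groups):
    """One-way ANOVA F ratio for a single measurement.

    `groups` holds one list per speaker, each containing that speaker's
    repeated values of the measurement (for example, a formant value in Hz).
    Assumption: the standard ANOVA definition; the abstract does not state
    the formula the paper used.
    """
    k = len(groups)                           # number of speakers
    n_total = sum(len(g) for g in groups)     # total number of tokens
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Interspeaker variability: spread of speaker means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Intraspeaker variability: spread of tokens around each speaker's mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Invented illustrative values (Hz): three speakers, five repetitions each.
example_tokens = [
    [1210.0, 1195.0, 1225.0, 1205.0, 1215.0],
    [1420.0, 1435.0, 1410.0, 1425.0, 1430.0],
    [1005.0, 1020.0, 990.0, 1010.0, 1000.0],
]
print(f_ratio(example_tokens))
```

A large value means speakers differ from one another much more than a single speaker varies across repetitions, which is the property that makes a measurement a candidate speaker-identifying feature.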

Goldstein, U. G. (1976). Speaker-identifying features based on formant tracks. The Journal of the Acoustical Society of America, 59(1), 176–182. https://doi.org/10.1121/1.380837
