Spoken Digit Recognition Using Time-Frequency Pattern Matching

  • Denes P
  • Mathews M

Abstract

A study of the machine recognition of the spoken digits zero through nine has been carried out by a digital computer simulation. The spoken utterances were converted to time-frequency patterns of spectral energy. Recognition was done by cross-correlating the pattern of an unknown utterance with a test pattern for each digit and selecting the digit having the highest correlation. Time normalization could be applied to all patterns, thus reducing utterances to a standard duration. Six male speakers and one female speaker provided 38 samples of each of the 10 digits. Pauses were made between successive words for segmentation.

No errors were observed in recognizing a single speaker using test patterns from his own speech with time normalization. A group of five male speakers and test patterns averaged over the group produced 6% errors with time normalization and 12% without. A 25% error rate occurred for the woman matched against male patterns.

The study indicates both the effectiveness and limitations of this simple recognition procedure for a limited vocabulary and a limited number of speakers. Time normalization improves performance in all cases.
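The procedure described in the abstract (normalize duration, correlate against one test pattern per digit, pick the highest score) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not specify how time normalization or correlation was computed, so linear interpolation along the time axis and a mean-removed normalized correlation are assumptions, and the names `time_normalize`, `correlation`, `recognize`, and `n_frames` are hypothetical.

```python
import math


def time_normalize(pattern, n_frames):
    """Resample a time-frequency pattern to a standard number of frames.

    `pattern` is a list of frames; each frame is a list of spectral energies,
    one per frequency channel. Linear interpolation is an assumption here.
    """
    if n_frames < 1 or not pattern:
        raise ValueError("need a non-empty pattern and n_frames >= 1")
    if len(pattern) == 1 or n_frames == 1:
        return [list(pattern[0]) for _ in range(n_frames)]
    scale = (len(pattern) - 1) / (n_frames - 1)
    out = []
    for i in range(n_frames):
        pos = i * scale
        lo = min(int(pos), len(pattern) - 2)  # index of the earlier neighbor frame
        frac = pos - lo
        out.append([(1 - frac) * a + frac * b
                    for a, b in zip(pattern[lo], pattern[lo + 1])])
    return out


def correlation(p, q):
    """Mean-removed normalized correlation of two equally sized patterns."""
    x = [v for frame in p for v in frame]
    y = [v for frame in q for v in frame]
    if len(x) != len(y):
        raise ValueError("patterns must have the same shape")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def recognize(unknown, templates, n_frames=20):
    """Return the best-matching label and the score for every template."""
    u = time_normalize(unknown, n_frames)
    scores = {label: correlation(u, time_normalize(t, n_frames))
              for label, t in templates.items()}
    return max(scores, key=scores.get), scores


# Toy example with two channels; the values are made up for illustration.
templates = {
    "one": [[0, 1], [1, 0], [2, 1]],
    "two": [[2, 0], [1, 1], [0, 0]],
}
unknown = [[0, 1], [0.5, 0.5], [1, 0], [1.5, 0.5], [2, 1]]  # a slower "one"
digit, scores = recognize(unknown, templates, n_frames=5)
print(digit)
```

In this toy case the unknown utterance is a time-stretched copy of the "one" template, so after normalization to a common duration it correlates most strongly with that template. Skipping the normalization step would require the patterns to already share a length, which gives some intuition for why the abstract reports fewer errors with time normalization.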

Cite

Denes, P., & Mathews, M. V. (1960). Spoken Digit Recognition Using Time-Frequency Pattern Matching. The Journal of the Acoustical Society of America, 32(11), 1450–1455. https://doi.org/10.1121/1.1907936