This paper reports experiments on the automatic classification of disordered connected speech into two categories (normal, pathological) or three categories (modal, moderately hoarse, severely hoarse). Multi-category classification by perceived degree of hoarseness is clinically meaningful and desirable, because reliable perceptual classification of disordered voice stimuli by human listeners is known to be difficult and time-consuming. The acoustic cues are temporal signal-to-dysperiodicity ratios and mel-frequency cepstral coefficients. The classifiers are support vector machines, trained and tested on two connected speech corpora. Binary classification accuracy was high (98%) for both sets of acoustic cues. Multi-category classification accuracy was 70% with signal-to-dysperiodicity ratios and 59% with mel-frequency cepstral coefficients. © 2010 ISCA.
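The abstract does not define the signal-to-dysperiodicity ratio. As a minimal sketch, assuming the usual form of such a cue (signal energy over the energy of an estimated dysperiodicity component, expressed in dB), it could be computed as follows. The function name and its arguments are hypothetical, and how the dysperiodicity component is estimated from the speech signal is not covered here.

```python
import math


def signal_to_dysperiodicity_ratio(signal, dysperiodicity):
    """Return 10 * log10(signal energy / dysperiodicity energy) in dB.

    Assumption: `dysperiodicity` is a sample-by-sample estimate of the
    irregular part of `signal`, obtained by some separate analysis step.
    """
    if len(signal) != len(dysperiodicity):
        raise ValueError("inputs must have the same length")
    signal_energy = sum(x * x for x in signal)
    dysperiodicity_energy = sum(e * e for e in dysperiodicity)
    if signal_energy == 0 or dysperiodicity_energy == 0:
        raise ValueError("energies must be non-zero")
    return 10.0 * math.log10(signal_energy / dysperiodicity_energy)


# Toy frame: the dysperiodicity has one tenth of the signal's amplitude,
# so the energy ratio is 100, which corresponds to about 20 dB.
frame = [1.0, -1.0, 1.0, -1.0]
residue = [0.1, -0.1, 0.1, -0.1]
print(round(signal_to_dysperiodicity_ratio(frame, residue), 6))
```

Under this assumption a higher ratio means a more regular signal, so ratios of this kind, computed per recording, would serve as input features to a classifier such as a support vector machine.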
CITATION STYLE
Alpan, A., Schoentgen, J., Maryn, Y., & Grenez, F. (2010). Automatic perceptual categorization of disordered connected speech. In Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 (pp. 2574–2577). International Speech Communication Association. https://doi.org/10.21437/interspeech.2010-696