Head and neck cancer can significantly hamper speech production, which often reduces speech intelligibility. This paper presents a method of extracting spectral features based on a multi-resolution sinusoidal transform scheme, which enables better representation of spectral and harmonic characteristics. Regression methods were used to predict interval-scaled intelligibility scores of utterances in the NKI-CCRT speech corpus. Including these features lowered the mean squared estimation error from 0.43 to 0.39 on a scale from 1 to 7 (p < 0.001). For binary intelligibility classification, including them yielded an improvement of 5.0 percentage points when tested on a disjoint set.
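The idea of analyzing speech at several spectral resolutions can be illustrated with a minimal sketch. This is not the authors' sinusoidal transform scheme: it only assumes that the same frame is examined with several window lengths and that the strongest spectral peak is picked at each one. The function name `peak_frequencies`, the window lengths, and the synthetic 440 Hz test tone are all hypothetical choices made for illustration.

```python
import numpy as np

def peak_frequencies(signal, sample_rate, window_lengths=(256, 512, 1024)):
    """Return the strongest spectral peak (in Hz) found at each window length.

    Illustrative only: a longer window gives finer frequency resolution
    (sample_rate / n Hz per bin) at the cost of coarser time resolution.
    """
    peaks = {}
    for n in window_lengths:
        # Taper the first n samples with a Hann window to limit leakage.
        frame = signal[:n] * np.hanning(n)
        magnitude = np.abs(np.fft.rfft(frame))
        # Index of the largest magnitude bin, converted to a frequency in Hz.
        k = int(np.argmax(magnitude))
        peaks[n] = k * sample_rate / n
    return peaks

# Synthetic test signal (an assumption, not data from the corpus).
sample_rate = 16000
t = np.arange(2048) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
peaks = peak_frequencies(tone, sample_rate)
```

Each entry of `peaks` lies within one frequency bin of the true tone, and the bin width shrinks as the window grows. A real sinusoidal analysis would track several peaks per frame, with their amplitudes and phases, rather than a single maximum.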
CITATION
Kim, J. C., Rao, H., & Clements, M. A. (2014). Speech intelligibility estimation using multi-resolution spectral features for speakers undergoing cancer treatment. The Journal of the Acoustical Society of America, 136(4), EL315–EL321. https://doi.org/10.1121/1.4896410