Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain

Helia Relaño-Iborra; Tobias May; Johannes Zaar; Christoph Scheidiger; Torsten Dau

Journal Article, Open Access

The Journal of the Acoustical Society of America (2016) 140(4) 2670-2679

DOI: 10.1121/1.4964505

Abstract

A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436–446] with a correlation back end inspired by the short-time objective intelligibility measure [STOI; Taal, Hendriks, Heusdens, and Jensen (2011). IEEE Trans. Audio Speech Lang. Process. 19(7), 2125–2136]. This “hybrid” model, named sEPSMcorr, is shown to account for the effects of stationary and fluctuating additive interferers as well as for the effects of non-linear distortions, such as spectral subtraction, phase jitter, and ideal time-frequency segregation (ITFS). The model shows a broader predictive range than both the original mr-sEPSM (which fails in the phase-jitter and ITFS conditions) and STOI (which fails to predict the influence of fluctuating interferers), albeit with lower accuracy than the source models in some individual conditions. Similar to other models that employ a short-term correlation-based back end, including STOI, the proposed model fails to account for the effects of room reverberation on speech intelligibility. Overall, the model might be valuable for evaluating the effects of a large range of interferers and distortions on speech intelligibility, including consequences of hearing impairment and hearing-instrument signal processing.
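
The idea of a short-time correlation back end can be illustrated with a minimal sketch. It assumes the clean and degraded temporal envelopes are already available as arrays and averages a Pearson correlation over non-overlapping segments. The function name `short_time_correlation`, the segment length, and the toy 4 Hz envelopes are hypothetical choices for illustration only. The sketch omits the mr-sEPSM auditory front end (auditory and modulation filterbanks, multi-resolution segmentation) and is not the published sEPSMcorr implementation.

```python
import numpy as np


def short_time_correlation(clean_env, degraded_env, seg_len):
    """Mean Pearson correlation between two envelopes over short segments.

    Hypothetical helper for illustration; not the published model.
    """
    clean_env = np.asarray(clean_env, dtype=float)
    degraded_env = np.asarray(degraded_env, dtype=float)
    if clean_env.shape != degraded_env.shape:
        raise ValueError("envelopes must have the same shape")

    n_seg = len(clean_env) // seg_len
    if n_seg == 0:
        raise ValueError("signal is shorter than one segment")

    scores = []
    for k in range(n_seg):
        # Take one non-overlapping segment from each envelope.
        c = clean_env[k * seg_len:(k + 1) * seg_len]
        d = degraded_env[k * seg_len:(k + 1) * seg_len]
        # Remove the segment mean so only the fluctuations are compared.
        c = c - c.mean()
        d = d - d.mean()
        denom = np.linalg.norm(c) * np.linalg.norm(d)
        # A flat segment carries no modulation; score it as zero.
        scores.append(0.0 if denom == 0 else float(np.dot(c, d) / denom))
    return float(np.mean(scores))


# Toy envelopes (assumed values): a 4 Hz modulation sampled at 1 kHz.
fs = 1000
t = np.arange(fs) / fs
clean = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)
inverted = 1.0 - 0.5 * np.sin(2 * np.pi * 4 * t)

same_score = short_time_correlation(clean, clean, seg_len=250)
inverted_score = short_time_correlation(clean, inverted, seg_len=250)
```

In a fuller model, a score of this kind would be computed separately for each auditory and modulation channel and then mapped to an intelligibility estimate; that mapping is outside the scope of this sketch.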

Relaño-Iborra, H., May, T., Zaar, J., Scheidiger, C., & Dau, T. (2016). Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain. The Journal of the Acoustical Society of America, 140(4), 2670–2679. https://doi.org/10.1121/1.4964505
