Abstract
To address the problem of real-time speech quality assessment, we propose a high-performance algorithm based on a psychoacoustic model: Efficient Psychoacoustics Evaluation of Speech Quality (EPESQ). EPESQ proceeds as follows. The original speech sample and the corresponding degraded sample are first preprocessed by overall gain compensation and IRS (Intermediate Reference System) filtering. Both signals are then transformed to their loudness representations through a series of consecutive steps: windowed fast Fourier transform, frequency warping to the Mel scale, and loudness mapping. The loudness representations are compared in each time-frequency cell, and the resulting differences are called disturbances. The disturbances are aggregated over time and frequency, and the result is passed through a cognitive formula to produce the final evaluation score. Experimental results show that, relative to the P.862 algorithm, EPESQ reduces running time by 37.5% and memory occupation by 51.9%, with only a 7.8% decrease in average correlation with listener opinions. This makes EPESQ suitable for real-time applications. It has been implemented as a self-evaluating component in our Internet voice communication system. © 2006 Asian Network for Scientific Information.
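The processing chain described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' EPESQ implementation: the function names (`mel_band_powers`, `toy_quality_score`), the frame length, the number of bands, the compression exponent 0.23, and the linear score mapping are all assumptions chosen for illustration, and IRS filtering is omitted entirely.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel-scale formula; the paper's exact warping is not given in the abstract.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_band_powers(signal, fs=8000, frame_len=256, hop=128, n_bands=24):
    """Windowed FFT, then pooling of spectral power into Mel-spaced bands."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    # Band edges equally spaced on the Mel axis from 0 Hz to the Nyquist frequency.
    edges = np.linspace(0.0, hz_to_mel(fs / 2.0), n_bands + 1)
    band_idx = np.clip(np.digitize(hz_to_mel(freqs), edges) - 1, 0, n_bands - 1)
    bands = np.zeros((n_frames, n_bands))
    for b in range(n_bands):
        mask = band_idx == b
        if mask.any():
            bands[:, b] = power[:, mask].sum(axis=1)
    return bands

def toy_quality_score(reference, degraded, fs=8000):
    """Return (score, mean disturbance) for two time-aligned signals."""
    # Overall gain compensation: match the degraded signal's energy to the reference.
    ref_energy = np.sum(reference ** 2)
    deg_energy = max(np.sum(degraded ** 2), 1e-12)
    degraded = degraded * np.sqrt(ref_energy / deg_energy)
    # Loudness mapping: power-law compression (exponent is an assumption).
    loud_ref = mel_band_powers(reference, fs) ** 0.23
    loud_deg = mel_band_powers(degraded, fs) ** 0.23
    # Disturbance per time-frequency cell.
    disturbance = np.abs(loud_deg - loud_ref)
    # Aggregate over frequency first, then over time.
    total = disturbance.mean(axis=1).mean()
    # Placeholder for the cognitive formula: a clipped linear map to a MOS-like range.
    score = float(np.clip(4.5 - 0.5 * total, 1.0, 4.5))
    return score, float(total)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    ref = np.sin(2 * np.pi * 440 * t)
    noisy = ref + 0.3 * rng.standard_normal(ref.shape)
    print("identical:", toy_quality_score(ref, ref))
    print("noisy:    ", toy_quality_score(ref, noisy))
```

In this sketch, identical inputs produce zero disturbance and the maximum score, while added noise raises the disturbance and lowers the score. A real system would also need time alignment of the two signals, which the abstract does not describe and the sketch assumes has already been done.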
Cite
Zhang, J., Gao, L., & Zhang, D. (2006). A high-performance psychoacoustics approach to speech quality evaluation. Information Technology Journal, 5(3), 485–488. https://doi.org/10.3923/itj.2006.485.488