Temporally localized distortions account for the highest variance in subjective evaluation of coded speech signals (Sen, 2001; Hall, 2001). The ability to discern perceptually relevant, temporally localized coding noise and to separate it from other types of distortion is both of theoretical importance and a valuable tool for designing and deploying speech synthesis systems. This work uses a physiologically motivated cochlear model to provide a tractable analysis of salient feature trajectories as processed by the cochlea. Subsequent statistical analysis shows simple relationships between the jitter of these trajectories and the temporal attributes of the Diagnostic Acceptability Measure (DAM). Copyright © 2009 W. Lu and D. Sen.
CITATION STYLE
Sen, D., & Lu, W. (2009). Analysis of salient feature jitter in the cochlea for objective prediction of temporally localized distortion in synthesized speech. EURASIP Journal on Audio, Speech, and Music Processing, 2009. https://doi.org/10.1155/2009/865723