Noise-robust text-dependent speaker identification using cochlear models

  • Islam M
  • Xu Y
  • Monk T
  • et al.

Abstract

One challenging issue in speaker identification (SID) is to achieve noise-robust performance. Humans can accurately identify speakers, even in noisy environments. We can leverage our knowledge of the function and anatomy of the human auditory pathway to design SID systems that achieve better noise-robust performance than conventional approaches. We propose a text-dependent SID system based on a real-time cochlear model called cascade of asymmetric resonators with fast-acting compression (CARFAC). We investigate the SID performance of CARFAC on signals corrupted by noise of various types and levels. We compare its performance with conventional auditory feature generators including mel-frequency cepstrum coefficients, frequency domain linear predictions, as well as another biologically inspired model called the auditory nerve model. We show that CARFAC outperforms other approaches when signals are corrupted by noise. Our results are consistent across datasets, types and levels of noise, different speaking speeds, and back-end classifiers. We show that the noise-robust SID performance of CARFAC is largely due to its nonlinear processing of auditory input signals. Presumably, the human auditory system achieves noise-robust performance via inherent nonlinearities as well.

Cite

Islam, Md. A., Xu, Y., Monk, T., Afshar, S., & van Schaik, A. (2022). Noise-robust text-dependent speaker identification using cochlear models. The Journal of the Acoustical Society of America, 151(1), 500–516. https://doi.org/10.1121/10.0009314
