In speaker diarization, the most commonly used speaker feature is MFCC, which is also the most commonly used speech feature in speech recognition. The recently proposed Power Normalized Cepstrum Coefficients (PNCC) achieve an impressive improvement over MFCC in noisy speech recognition, which motivates testing them for speaker diarization. In this paper, PNCC is evaluated against MFCC in a meeting-domain speaker diarization system. In terms of Diarization Error Rate (DER), PNCC shows no improvement over MFCC. A possible reason is that PNCC suppresses the high-frequency spectrum, which is believed to carry the characteristics of a speaker's voice. A strategy for selecting the initial model training material is also proposed and used in the speaker diarization system in this work. © 2010 IEEE.