This paper proposes novel features based on linear prediction of temporal phase (LPTP) for the speaker recognition task. The proposed LPTC feature vector consists of Discrete Cosine Transform (DCT) coefficients of the LP spectrum derived from the temporal phase of the speech signal; the DCT is applied for energy compaction and decorrelation. Results are reported on the standard NIST 2002 SRE using the GMM-UBM (Gaussian Mixture Modeling-Universal Background Modeling) approach. A recently proposed supervised score-level fusion method is used to combine the evidence from Mel Frequency Cepstral Coefficients (MFCC) and the proposed feature set. The performance of the proposed feature set is compared with that of state-of-the-art MFCC features. The results show that the proposed features give a 4% improvement in identification rate and a 2% reduction in equal error rate (EER) compared with standard MFCC alone. In addition, when the supervised score-level fusion is used, the identification rate improves by 8% and the EER decreases by 2%, indicating that the proposed features capture information complementary to that of MFCC.
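The feature pipeline described above (temporal phase, then LP spectrum, then DCT) can be illustrated with a minimal per-frame sketch. It is not the authors' implementation: the abstract does not define temporal phase, so the sketch assumes the cosine of the analytic-signal (Hilbert) phase, and it assumes the autocorrelation method with Levinson-Durbin recursion for LP analysis. The function names (`cos_temporal_phase`, `lpc`, `lp_log_spectrum`, `dct2`, `lptp_features`) and the default order, bin count, and coefficient count are hypothetical choices made for illustration.

```python
import cmath
import math


def cos_temporal_phase(frame):
    """Cosine of the analytic-signal phase (ASSUMED definition of temporal phase)."""
    n = len(frame)
    # Naive O(n^2) DFT keeps the sketch dependency-free; fine for short frames.
    spec = [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
            for k in range(n)]
    # Analytic signal: double the positive frequencies, zero the negative ones.
    for k in range(1, n):
        if 2 * k < n:
            spec[k] *= 2
        elif 2 * k > n:
            spec[k] = 0j
    analytic = [sum(s * cmath.exp(2j * math.pi * k * t / n) for k, s in enumerate(spec)) / n
                for t in range(n)]
    return [math.cos(cmath.phase(z)) for z in analytic]


def lpc(x, order):
    """LP coefficients a[0..order] (a[0] = 1) and residual energy, via Levinson-Durbin."""
    n = len(x)
    r = [sum(x[t] * x[t + lag] for t in range(n - lag)) for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # signal already perfectly predicted; stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lp_log_spectrum(a, err, n_bins):
    """Log power of the all-pole model err / |A(e^{jw})|^2 on a half-sample grid over (0, pi)."""
    gain = max(err, 1e-12)  # floor avoids log(0)
    out = []
    for m in range(n_bins):
        w = math.pi * (m + 0.5) / n_bins
        denom = abs(sum(c * cmath.exp(-1j * w * j) for j, c in enumerate(a)))
        out.append(math.log(gain) - 2.0 * math.log(max(denom, 1e-12)))
    return out


def dct2(values, n_coef):
    """First n_coef unnormalized DCT-II coefficients."""
    m_total = len(values)
    return [sum(v * math.cos(math.pi * k * (m + 0.5) / m_total) for m, v in enumerate(values))
            for k in range(n_coef)]


def lptp_features(frame, order=8, n_bins=32, n_coef=10):
    """Hypothetical end-to-end sketch: phase signal -> LP spectrum -> DCT coefficients."""
    phase_signal = cos_temporal_phase(frame)
    a, err = lpc(phase_signal, order)
    return dct2(lp_log_spectrum(a, err, n_bins), n_coef)
```

Calling `lptp_features(frame)` on one short frame of samples (a list of floats) returns a list of `n_coef` coefficients. In a full system, such per-frame vectors would be the input to the GMM-UBM back end; a practical implementation would likely use an FFT-based Hilbert transform and apply a window before LP analysis.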
Gandhi, A., & Patil, H. A. (2017). Novel linear prediction temporal phase based features for speaker recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10458 LNAI, pp. 564–571). Springer Verlag. https://doi.org/10.1007/978-3-319-66429-3_56