Intra-modal fusion, i.e., the fusion of different features of the same modality, is proposed for a speaker identification system. Two fusion methods for multiple features, one at the feature level and one at the decision level, are proposed in this study. We use multiple features derived from the MFCC and the wavelet transform of the speech signal. Wavelet-based features capture frequency variation across time, while MFCC features mainly approximate the base frequency information; both are important. A final score is calculated with a weighted sum rule over the matching results of the different features. We evaluate the proposed fusion strategies on the VoxForge speech dataset using a K-Nearest Neighbor classifier. Multiple features yielded more promising results than any single feature alone. Furthermore, the multi-feature approach also performed well at different SNRs on NOIZEUS, a noisy speech corpus. © 2011 Springer-Verlag.
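The weighted-sum rule at the decision level can be sketched as follows. This is a minimal illustration, not the paper's implementation: the per-speaker feature vectors, the min-max score normalisation, and the weight `w` are all assumptions introduced here for clarity.

```python
# Sketch of decision-level (score) fusion with a weighted sum rule.
# All feature values, speaker names, and the weight w are illustrative.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def match_scores(probe, gallery):
    """Turn distances into similarity scores in [0, 1] via min-max normalisation."""
    d = {spk: euclidean(probe, feat) for spk, feat in gallery.items()}
    lo, hi = min(d.values()), max(d.values())
    return {spk: 1.0 - (v - lo) / (hi - lo) for spk, v in d.items()}

def fuse(scores_a, scores_b, w=0.6):
    """Weighted sum rule: fused = w * score_a + (1 - w) * score_b."""
    return {spk: w * scores_a[spk] + (1 - w) * scores_b[spk] for spk in scores_a}

# Toy enrolment data: per-speaker MFCC-like and wavelet-like vectors (made up).
gallery_mfcc = {"spk1": [1.0, 2.0], "spk2": [4.0, 4.0], "spk3": [8.0, 1.0]}
gallery_wav = {"spk1": [0.2, 0.9], "spk2": [0.8, 0.1], "spk3": [0.5, 0.5]}

probe_mfcc, probe_wav = [1.2, 2.1], [0.25, 0.85]

fused = fuse(match_scores(probe_mfcc, gallery_mfcc),
             match_scores(probe_wav, gallery_wav))
identified = max(fused, key=fused.get)  # speaker with the highest fused score
```

In a real system the gallery vectors would be models trained per speaker, and the K-Nearest Neighbor classifier mentioned in the abstract would supply the per-feature matching results that feed the sum.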
CITATION STYLE
Verma, G. K. (2011). Multi-feature fusion for closed set text independent speaker identification. In Communications in Computer and Information Science (Vol. 141 CCIS, pp. 170–179). https://doi.org/10.1007/978-3-642-19423-8_18