Mandarin, also known as Standard Chinese, is an official language of China and Singapore, but it is spoken with noticeable differences by people from different homeplaces. Homeplace classification is therefore important for speech recognition and machine translation. In this paper, we propose a novel model named Bag-of-phonemes (BOP) for homeplace classification of Mandarin speakers, which follows an idea conceptually similar to the Bag-of-words (BOW) model in text processing. The low-level Mel-frequency cepstral coefficient (MFCC) speech features of each homeplace are clustered into a set of codewords referred to as phonemes. With this codebook, each speech signal can be represented by a feature vector describing its distribution over phonemes. Classical classifiers such as the support vector machine (SVM) can then be applied for classification. The model is tested on the RASC863 database; empirical studies show that it performs better on this database than previous work [1].
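The representation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes MFCC frames have already been extracted (toy 2-D vectors stand in for them here), it reads "clustered for each homeplace" as running k-means per homeplace and concatenating the centroids into one codebook, and all names (`kmeans`, `build_codebook`, `bop_vector`, `k_per_place`) are hypothetical.

```python
import random

def sq_dist(a, b):
    # Squared Euclidean distance between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(frame, codebook):
    # Index of the codeword ("phoneme") closest to this frame.
    return min(range(len(codebook)), key=lambda k: sq_dist(frame, codebook[k]))

def kmeans(frames, k, iters=20, seed=0):
    # Plain k-means; centroids are initialised from randomly sampled frames.
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for f in frames:
            groups[nearest(f, centroids)].append(f)
        for j, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(g) for col in zip(*g)]
    return centroids

def build_codebook(frames_by_place, k_per_place):
    # Assumption: cluster each homeplace separately, then concatenate.
    codebook = []
    for frames in frames_by_place.values():
        codebook.extend(kmeans(frames, k_per_place))
    return codebook

def bop_vector(frames, codebook):
    # Normalised histogram of codeword assignments for one utterance.
    counts = [0] * len(codebook)
    for f in frames:
        counts[nearest(f, codebook)] += 1
    return [c / len(frames) for c in counts]

# Toy data: 2-D points stand in for real MFCC frames.
rng = random.Random(1)

def toy_frames(center, n):
    return [[rng.gauss(c, 0.1) for c in center] for _ in range(n)]

frames_by_place = {"A": toy_frames([0.0, 0.0], 50), "B": toy_frames([5.0, 5.0], 50)}
codebook = build_codebook(frames_by_place, k_per_place=2)
vec = bop_vector(toy_frames([5.0, 5.0], 30), codebook)
```

Each utterance becomes a fixed-length vector such as `vec`, regardless of its duration, which is what allows a standard classifier (for example an SVM) to be trained on top of it.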
Zhao, H., Qin, Z., Wang, Y., & Wang, Y. (2015). A bag-of-phonemes model for homeplace classification of mandarin speakers. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9117, pp. 683–690). Springer Verlag. https://doi.org/10.1007/978-3-319-19390-8_76