We introduce a new technique for using the speech of multiple reference speakers as a basis for speaker adaptation in large vocabulary continuous speech recognition. In contrast to other methods that use a pooled reference model, this technique normalizes the training speech from multiple reference speakers to a single common feature space before pooling it. The normalized and pooled speech can then be treated as if it came from a single reference speaker for training the reference hidden Markov model (HMM). Our usual probabilistic spectrum transformation can be applied to the reference HMM to model a new (target) speaker. In this paper, we describe our baseline (single reference speaker) speaker-adaptation system and give current performance results from a recent formal evaluation of the system. We also describe our proposal for adapting from multiple reference speakers and report on recent preliminary experimental results in support of the proposed technique.
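The normalize-then-pool idea can be illustrated with a minimal sketch. It assumes a simple per-speaker mean and variance normalization as a stand-in for the mapping to a common feature space; this is not the probabilistic spectrum transformation used in the paper, and the function names, speaker labels, and toy feature values are all hypothetical.

```python
from statistics import fmean, pstdev


def normalize_speaker(frames):
    """Map one speaker's frames to zero mean and unit variance per dimension.

    Assumption: this affine normalization only stands in for the paper's
    transformation to a common feature space.
    """
    dims = range(len(frames[0]))
    means = [fmean(f[d] for f in frames) for d in dims]
    # Fall back to 1.0 for a constant dimension to avoid dividing by zero.
    stds = [pstdev([f[d] for f in frames]) or 1.0 for d in dims]
    return [[(f[d] - means[d]) / stds[d] for d in dims] for f in frames]


def pool_normalized(speakers):
    """Normalize each reference speaker separately, then pool all frames."""
    pooled = []
    for frames in speakers.values():
        pooled.extend(normalize_speaker(frames))
    return pooled


# Hypothetical toy data: two reference speakers, 2-dimensional features.
speakers = {
    "ref_a": [[1.0, 10.0], [3.0, 14.0]],
    "ref_b": [[100.0, -2.0], [104.0, 2.0]],
}
pooled = pool_normalized(speakers)
```

After this step the pooled frames share one scale even though the two speakers' raw features differ widely, so they could in principle be handed to a single reference-model trainer as if they came from one speaker.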
Kubala, F., Schwartz, R., & Barry, C. (1989). Speaker Adaptation Using Multiple Reference Speakers. In Speech and Natural Language, Proceedings of a Workshop (pp. 256–262). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1075434.1075476