This work focuses on generating HMM-based children’s acoustic models for speech recognition from adult acoustic models. Collecting children’s speech data is more costly than collecting adult speech data. The patent-pending method developed in this work requires only adult data to estimate synthetic children’s acoustic models in any language. It works as follows: for a new language where only adult data is available, an adult male model and an adult female model are trained. A linear transformation from each male HMM mean vector to its closest female mean vector is estimated. This transform is then scaled to a certain power and applied to the female model to obtain a synthetic children’s model. In a pronunciation verification task, the method yields 19% and 3.7% relative improvement on native English and Spanish children’s data, respectively, compared to the best adult model. For Spanish data, the new model outperforms the available model trained on real children’s data by 13% relative.
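The three steps described above (pair each male mean with its closest female mean, estimate a transform, apply a powered version of it to the female means) can be illustrated with a small sketch. The abstract does not specify the form of the linear transformation or the value of the power, so this sketch makes simplifying assumptions: the transform is a per-dimension (diagonal) least-squares scale, "scaled to a certain power" is read as raising each scale to an exponent `alpha`, and closeness is Euclidean distance. All function names are hypothetical, and the means are plain lists rather than parameters of a real HMM toolkit.

```python
import math


def nearest(vec, candidates):
    """Return the candidate vector closest to vec (Euclidean distance)."""
    return min(candidates, key=lambda c: math.dist(vec, c))


def estimate_diagonal_transform(male_means, female_means):
    """Estimate one scale per dimension mapping male means to female means.

    Assumption: a diagonal transform stands in for the paper's linear
    transformation. Each scale is the least-squares solution of
    scale * male[d] ~= closest_female[d] over all male mean vectors.
    """
    pairs = [(m, nearest(m, female_means)) for m in male_means]
    scales = []
    for d in range(len(male_means[0])):
        num = sum(m[d] * f[d] for m, f in pairs)
        den = sum(m[d] * m[d] for m, _ in pairs)
        scales.append(num / den)
    return scales


def synthesize_child_means(female_means, scales, alpha):
    """Apply scale**alpha per dimension to every female mean vector.

    Assumption: alpha is a free tuning parameter; the abstract does not
    state its value. Scales must be positive for a real-valued power.
    """
    if any(s <= 0 for s in scales):
        raise ValueError("scales must be positive to take a real power")
    powered = [s ** alpha for s in scales]
    return [[p * x for p, x in zip(powered, f)] for f in female_means]


# Toy two-dimensional means, invented for illustration only.
male_means = [[1.0, 2.0], [3.0, 4.0]]
female_means = [[1.2, 2.2], [3.6, 4.4]]

scales = estimate_diagonal_transform(male_means, female_means)
child_means = synthesize_child_means(female_means, scales, alpha=1.0)
```

With `alpha=0` the female means are returned unchanged, and larger values of `alpha` push the synthetic means further along the male-to-female direction. A full-matrix transform would need a matrix power instead of the element-wise power used here.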
Hagen, A., Pellom, B., & Hacioglu, K. (2009). Generating synthetic children’s acoustic models from adult models. In NAACL-HLT 2009 - Human Language Technologies: 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Short Papers (pp. 77–80). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1620853.1620877