Baseline WSJ acoustic models for HTK and Sphinx: Training recipes and recognition experiments

  • Vertanen K

Abstract

For speech recognition research, it is often necessary to start with a competent baseline acoustic model. But training and tuning a competent model using research recognizers such as Cambridge’s HTK and CMU’s Sphinx can be time-consuming. In an effort to minimize wasted effort, I have created recipes for HTK and Sphinx which utilize the standard Wall Street Journal training corpus. In this paper, these recipes are described. The word error rate (WER) and real-time performance of the models are evaluated for differing HMM topologies, number of tied states, number of Gaussians, and differing test sets. My goal is to provide practical advice and results to researchers who are thinking of using HTK or Sphinx for real-time recognition on dictation-like tasks.

Cite

Vertanen, K. (2006). Baseline WSJ acoustic models for HTK and Sphinx: Training recipes and recognition experiments. Cavendish Laboratory, University of Cambridge. Retrieved from http://medcontent.metapress.com/index/A65RM03P4874243N.pdf
