Filled pauses (FPs) are characteristic of spontaneous speech and can cause considerable problems for speech recognition because they are often misrecognized as short words. For example, an "um" can be recognized as "thumb" or "arm" if the recognizer's language model does not adequately represent FPs. Recognition of quasi-spontaneous speech, such as medical dictation, is subject to this problem as well. Results from medical dictations by 21 family practice physicians show that a language model trained on a corpus populated with FPs produces better overall results than a model trained on a corpus that excluded FPs or on a corpus with FPs placed at random.
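The three training conditions compared in the abstract can be illustrated with a small sketch. The abstract does not describe how the corpora were actually built, so everything here is an assumption made for illustration: the FP inventory, the sample transcript, and the helper names `strip_fps`, `randomize_fps`, and `bigram_counts` are all hypothetical. The bigram counts only show why a corpus without FPs leaves a language model with no evidence about where an "um" tends to occur.

```python
import random
from collections import Counter

# Assumed FP inventory, for illustration only (not taken from the paper).
FILLED_PAUSES = {"um", "uh", "ah"}

def strip_fps(tokens):
    """Return the corpus variant with filled pauses excluded."""
    return [t for t in tokens if t not in FILLED_PAUSES]

def randomize_fps(tokens, rng):
    """Return the variant with the same FPs reinserted at random positions."""
    fps = [t for t in tokens if t in FILLED_PAUSES]
    words = strip_fps(tokens)
    for fp in fps:
        # randint is inclusive, so an FP may also land at the very end.
        words.insert(rng.randint(0, len(words)), fp)
    return words

def bigram_counts(tokens):
    """Count adjacent token pairs, the raw material of a bigram language model."""
    return Counter(zip(tokens, tokens[1:]))

# Hypothetical dictation fragment with FPs transcribed where they were spoken.
transcript = "the patient um presents with uh chest pain".split()

variants = {
    "with_fps": transcript,
    "no_fps": strip_fps(transcript),
    "random_fps": randomize_fps(transcript, random.Random(0)),
}

for name, toks in variants.items():
    print(name, "|", " ".join(toks))
    print("  count(patient, um) =", bigram_counts(toks)[("patient", "um")])
```

In the `no_fps` variant the pair ("patient", "um") never occurs, so a model estimated from it has nothing to say about FPs in context. The `random_fps` variant contains the same FPs, but their positions no longer reflect where speakers actually hesitated.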
Pakhomov, S. V. (1999). Modeling filled pauses in medical dictations. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 1999-June, pp. 619–624). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1034678.1034692