Adjusting occurrence probabilities of automatically-generated abbreviated words in spoken dialogue systems


Abstract

Users of spoken dialogue systems often abbreviate long words, which causes automatic speech recognition (ASR) errors. We define abbreviated words as sub-words of an original word and add them to the ASR dictionary. This raises two problems. First, long and compound words must be segmented in agglutinative languages such as Japanese, but general morphological analyzers cannot segment proper nouns correctly. Second, adding many abbreviated words enlarges the vocabulary, which degrades ASR accuracy. We have developed two methods: (1) segmenting words by using conjunction probabilities between characters, and (2) adjusting the occurrence probabilities of the generated abbreviated words on the basis of two cues: the phonological similarity between the abbreviated and original words, and the frequency of the abbreviated words in Web documents. Our method improves ASR accuracy by 34.9 points for utterances containing abbreviated words without degrading accuracy for utterances containing original words.

© 2009 Springer Berlin Heidelberg

Keywords: Spoken dialogue systems; abbreviated words; adjusting occurrence probabilities
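The two methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the threshold rule in `segment`, the use of `difflib.SequenceMatcher` on romanized strings as a stand-in for phonological similarity, the product of similarity and Web count as a score, and the `abbrev_mass` parameter are all assumptions made here for illustration.

```python
from difflib import SequenceMatcher


def segment(word, conj_prob, threshold=0.5):
    """Split `word` wherever the conjunction probability between two
    adjacent characters falls below `threshold` (assumed rule)."""
    if not word:
        return []
    segments = [word[0]]
    for prev, cur in zip(word, word[1:]):
        if conj_prob.get((prev, cur), 0.0) >= threshold:
            segments[-1] += cur      # characters tend to stay together
        else:
            segments.append(cur)     # weak link: start a new segment
    return segments


def abbreviations(segments):
    """Candidate abbreviations: every contiguous run of segments that is
    shorter than the full word (one possible reading of "sub-words")."""
    n = len(segments)
    return ["".join(segments[i:j])
            for i in range(n)
            for j in range(i + 1, n + 1)
            if j - i < n]


def similarity(a, b):
    """Stand-in for phonological similarity: string similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def adjusted_probabilities(original, original_prob, candidates,
                           web_counts, abbrev_mass=0.5):
    """Share `abbrev_mass` of the original word's probability among the
    candidates, weighted by similarity * Web count (hypothetical scoring).
    The total probability mass stays equal to `original_prob`."""
    scores = {c: similarity(c, original) * web_counts.get(c, 0)
              for c in candidates}
    total = sum(scores.values())
    if total == 0:
        return {original: original_prob}   # no evidence for any candidate
    probs = {c: abbrev_mass * original_prob * s / total
             for c, s in scores.items()}
    probs[original] = (1.0 - abbrev_mass) * original_prob
    return probs


# Toy usage with made-up conjunction probabilities and Web counts.
segs = segment("abcd", {("a", "b"): 0.9, ("b", "c"): 0.1, ("c", "d"): 0.8})
cands = abbreviations(segs)
print(segs, cands)
print(adjusted_probabilities("abcd", 0.01, cands, {"ab": 100, "cd": 0}))
```

In this sketch a candidate that never appears in Web documents receives zero probability, which reflects the idea that adding fewer unlikely entries limits the accuracy loss caused by a larger vocabulary.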

Cite

Katsumaru, M., Komatani, K., Ogata, T., & Okuno, H. G. (2009). Adjusting occurrence probabilities of automatically-generated abbreviated words in spoken dialogue systems. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5579 LNAI, pp. 481–490). https://doi.org/10.1007/978-3-642-02568-6_49
