Lexicon optimization for WFST-based speech recognition using acoustic distance based confusability measure and G2P conversion


Abstract

In this paper, we propose a lexicon optimization method based on a confusability measure (CM) for building a large-vocabulary continuous speech recognition (LVCSR) system that must handle unseen words. When a lexicon is built or expanded for unseen words by using grapheme-to-phoneme (G2P) conversion, the lexicon size increases because G2P conversion generally maps each word to its N-best pronunciations (a 1-to-N-best mapping). The proposed method therefore prunes confusable words from the lexicon using a CM, defined as the acoustic model distance between two phonemic sequences. LVCSR experiments on a Wall Street Journal task show that the proposed lexicon optimization method achieves a relative word error rate (WER) reduction of 14.72% compared with the 1-to-4-best G2P-converted lexicon approach.
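
The pruning idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the acoustic model distance is computed or what the exact pruning rule is, so everything below is an assumption made for illustration: the phoneme distance table `PHONE_DIST`, the costs `DEFAULT_DIST` and `INDEL_COST`, the weighted edit distance used as a stand-in for the CM, and the rule "always keep the 1-best pronunciation, drop a lower-ranked variant if it is closer than a threshold to another word's 1-best pronunciation".

```python
# Sketch only: the phoneme distances, costs, and pruning rule below are
# illustrative assumptions, not values or rules taken from the paper.

# Hypothetical acoustic distances between phoneme pairs (treated as symmetric).
PHONE_DIST = {("AE", "EH"): 0.3, ("T", "D"): 0.4}
DEFAULT_DIST = 1.0  # assumed distance for phoneme pairs not listed above
INDEL_COST = 1.0    # assumed cost of inserting or deleting a phoneme


def phone_distance(a, b):
    """Look up the assumed distance between two phonemes."""
    if a == b:
        return 0.0
    return PHONE_DIST.get((a, b), PHONE_DIST.get((b, a), DEFAULT_DIST))


def sequence_distance(p, q):
    """Weighted edit distance between two phoneme sequences (stand-in CM)."""
    prev = [j * INDEL_COST for j in range(len(q) + 1)]
    for i, a in enumerate(p, 1):
        cur = [i * INDEL_COST]
        for j, b in enumerate(q, 1):
            cur.append(min(
                prev[j] + INDEL_COST,                # delete a from p
                cur[j - 1] + INDEL_COST,             # insert b into p
                prev[j - 1] + phone_distance(a, b),  # substitute a with b
            ))
        prev = cur
    return prev[-1]


def prune_lexicon(lexicon, threshold):
    """Drop lower-ranked G2P variants that are too close to another word.

    lexicon maps each word to a non-empty list of phoneme tuples in N-best
    order. The 1-best pronunciation of every word is always kept.
    """
    pruned = {}
    for word, variants in lexicon.items():
        # 1-best pronunciations of all other words act as the rivals.
        rivals = [v[0] for w, v in lexicon.items() if w != word]
        kept = [variants[0]]
        for variant in variants[1:]:
            if all(sequence_distance(variant, r) >= threshold for r in rivals):
                kept.append(variant)
        pruned[word] = kept
    return pruned


# Toy 1-to-2-best lexicon with made-up pronunciations.
lexicon = {
    "bat": [("B", "AE", "T"), ("B", "EH", "T")],
    "bet": [("B", "EH", "T"), ("B", "EH", "D")],
}
pruned = prune_lexicon(lexicon, threshold=0.35)
```

In this toy lexicon, the second variant of "bat" coincides with the 1-best pronunciation of "bet" and is dropped, while the second variant of "bet" stays because it is farther than the threshold from the 1-best pronunciation of "bat". A real system would derive the distances from the acoustic model rather than from a hand-written table.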

Cite

Kim, N. K., Seong, W. K., & Kim, H. K. (2015). Lexicon optimization for WFST-based speech recognition using acoustic distance based confusability measure and G2P conversion. In Natural Language Dialog Systems and Intelligent Assistants (pp. 119–217). Springer International Publishing. https://doi.org/10.1007/978-3-319-19291-8_12
