Lexicon optimization for WFST-based speech recognition using acoustic distance based confusability measure and G2P conversion

Nam Kyun Kim; Woo Kyeong Seong; Hong Kook Kim

Book Chapter

Springer International Publishing, (2015), 119-217

DOI: 10.1007/978-3-319-19291-8_12

Abstract

In this paper, we propose a lexicon optimization method based on a confusability measure (CM) to develop a large vocabulary continuous speech recognition (LVCSR) system with unseen words. When a lexicon is built or expanded for unseen words by using grapheme-to-phoneme (G2P) conversion, the lexicon size increases because G2P is generally realized by 1-to-N-best mapping. Thus, the proposed method attempts to prune the confusable words in the lexicon by a CM defined as the acoustic model distance between two phonemic sequences. It is demonstrated through the LVCSR experiments that the proposed lexicon optimization method achieves a relative word error rate (WER) reduction of 14.72% in a Wall Street Journal task compared to the 1-to-4-best G2P converted lexicon approach.
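The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract defines the CM as an acoustic model distance between two phonemic sequences, whereas this sketch substitutes a weighted edit distance driven by a hypothetical phoneme distance table. The names `PHONE_DIST`, `DEFAULT_SUB`, `INDEL`, `sequence_distance`, and `prune_lexicon`, the threshold, and the rule of always keeping each word's first pronunciation are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Phones = Tuple[str, ...]
Entry = Tuple[str, Phones]

# Hypothetical phoneme-to-phoneme distances (illustrative values only).
# In the paper's setting these would be derived from the acoustic model.
PHONE_DIST: Dict[FrozenSet[str], float] = {
    frozenset(("AE", "EH")): 0.3,
    frozenset(("T", "D")): 0.4,
}
DEFAULT_SUB = 1.0  # assumed cost for phoneme pairs missing from the table
INDEL = 1.0        # assumed cost for inserting or deleting a phoneme


def phone_cost(a: str, b: str) -> float:
    """Substitution cost between two phonemes (0 when identical)."""
    if a == b:
        return 0.0
    return PHONE_DIST.get(frozenset((a, b)), DEFAULT_SUB)


def sequence_distance(x: Phones, y: Phones) -> float:
    """Weighted edit distance, a stand-in for the confusability measure."""
    prev = [j * INDEL for j in range(len(y) + 1)]
    for i, a in enumerate(x, start=1):
        curr = [i * INDEL]
        for j, b in enumerate(y, start=1):
            curr.append(min(
                prev[j] + INDEL,                # delete a from x
                curr[j - 1] + INDEL,            # insert b into x
                prev[j - 1] + phone_cost(a, b)  # substitute or match
            ))
        prev = curr
    return prev[-1]


def prune_lexicon(entries: List[Entry], threshold: float) -> List[Entry]:
    """Drop N-best variants that lie too close to another word's pronunciation.

    Entries are assumed to be ordered by G2P rank. The first pronunciation
    seen for each word is always kept (an assumption of this sketch) so that
    no word disappears from the lexicon.
    """
    kept: List[Entry] = []
    seen: Set[str] = set()
    for word, phones in entries:
        if word in seen:
            confusable = any(
                other != word and sequence_distance(phones, p) < threshold
                for other, p in kept
            )
            if confusable:
                continue
        seen.add(word)
        kept.append((word, phones))
    return kept


entries: List[Entry] = [
    ("bat", ("B", "AE", "T")),
    ("bet", ("B", "EH", "T")),
    ("bat", ("B", "EH", "T")),  # lower-ranked variant identical to "bet"
    ("bad", ("B", "AE", "D")),
]
pruned = prune_lexicon(entries, threshold=0.35)
```

With these illustrative values, the lower-ranked variant of "bat" is removed because it coincides with the pronunciation of "bet", while every word keeps its first-best pronunciation. A real system would tune the threshold against recognition accuracy.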

Cite

Kim, N. K., Seong, W. K., & Kim, H. K. (2015). Lexicon optimization for WFST-based speech recognition using acoustic distance based confusability measure and G2P conversion. In Natural Language Dialog Systems and Intelligent Assistants (pp. 119–217). Springer International Publishing. https://doi.org/10.1007/978-3-319-19291-8_12
