Vietnamese automatic speech recognition: The FLaVoR approach

Quan Vu; Kris Demuynck; Dirk Van Compernolle

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2006) 4274 LNAI 464-474

DOI: 10.1007/11939993_49

9 Citations

4 Readers

Abstract

Automatic speech recognition for languages in Southeast Asia, including Chinese, Thai and Vietnamese, typically models both acoustics and language at the syllable level. This paper presents a new approach to recognizing those languages that exploits information at the word level. The approach, adapted from our FLaVoR architecture [1], consists of two layers. In the first layer, a purely acoustic-phonemic search generates a dense phoneme network enriched with metadata. In the second layer, word decoding is performed on the composition of a series of finite-state transducers (FSTs) that combine knowledge sources at the sub-lexical, word-lexical and word-based language model levels. Experimental results on the Vietnamese Broadcast News corpus showed that the approach is both effective and flexible. © 2006 Springer-Verlag Berlin/Heidelberg.
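
The two-layer idea in the abstract can be illustrated with a small toy sketch. This is not the authors' FLaVoR implementation: a best-first search over (network node, previous word) states stands in for the FST composition, and every phoneme symbol, word, cost, and function name below is a hypothetical placeholder chosen for illustration. The first layer is represented only by its assumed output, a small phoneme network with acoustic costs; the second layer combines a pronunciation lexicon and a bigram word language model to pick the cheapest word sequence.

```python
import heapq
from collections import defaultdict

# Assumed output of layer 1: a phoneme network (DAG).
# Each arc is (source_node, target_node, phoneme, acoustic_cost); lower is better.
PHONEME_ARCS = [
    (0, 1, "v", 0.2), (0, 1, "b", 0.9),
    (1, 2, "i", 0.3),
    (2, 3, "n", 0.4), (2, 3, "m", 0.3),  # "m" is acoustically cheaper here
    (3, 4, "a", 0.1),
    (4, 5, "m", 0.2),
]
FINAL_NODE = 5

# Hypothetical word lexicon: pronunciation (phoneme tuple) -> word.
LEXICON = {
    ("v", "i"): "vi",
    ("b", "i"): "bi",
    ("n", "a", "m"): "nam",
    ("m", "a", "m"): "mam",
}

# Hypothetical bigram language-model costs: (previous_word, word) -> cost.
BIGRAM_COST = {
    ("<s>", "vi"): 0.5, ("<s>", "bi"): 0.4,
    ("vi", "nam"): 0.1, ("vi", "mam"): 2.0, ("bi", "nam"): 3.0,
}


def build_outgoing(arcs):
    """Index arcs by source node."""
    outgoing = defaultdict(list)
    for src, dst, phoneme, cost in arcs:
        outgoing[src].append((dst, phoneme, cost))
    return outgoing


def match_pronunciation(outgoing, node, phonemes):
    """Yield (end_node, acoustic_cost) for each path from node spelling phonemes."""
    if not phonemes:
        yield node, 0.0
        return
    for dst, phoneme, cost in outgoing.get(node, ()):
        if phoneme == phonemes[0]:
            for end, rest in match_pronunciation(outgoing, dst, phonemes[1:]):
                yield end, cost + rest


def decode(arcs, start_node, final_node, lexicon, bigram_cost, unseen_cost=5.0):
    """Layer 2: cheapest word sequence covering the network (Dijkstra-style)."""
    outgoing = build_outgoing(arcs)
    heap = [(0.0, start_node, ("<s>",))]  # (total_cost, node, word history)
    settled = set()                       # (node, previous_word) states already expanded
    while heap:
        cost, node, words = heapq.heappop(heap)
        if node == final_node:
            return list(words[1:]), cost
        state = (node, words[-1])
        if state in settled:
            continue
        settled.add(state)
        for phonemes, word in lexicon.items():
            lm = bigram_cost.get((words[-1], word), unseen_cost)
            for end, acoustic in match_pronunciation(outgoing, node, phonemes):
                heapq.heappush(heap, (cost + acoustic + lm, end, words + (word,)))
    return None, float("inf")


if __name__ == "__main__":
    best_words, best_cost = decode(PHONEME_ARCS, 0, FINAL_NODE, LEXICON, BIGRAM_COST)
    print(best_words, round(best_cost, 3))
```

In this toy network the phoneme "m" is acoustically cheaper than "n" at the third position, yet the decoder returns the word sequence "vi nam" because the word-level language model strongly prefers it. That is the kind of word-level information a purely syllable-level or phoneme-level search would not use.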

Cite (APA)

Vu, Q., Demuynck, K., & Van Compernolle, D. (2006). Vietnamese automatic speech recognition: The FLaVoR approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4274 LNAI, pp. 464–474). https://doi.org/10.1007/11939993_49
