Vietnamese automatic speech recognition: The FLaVoR approach


Abstract

Automatic speech recognition for languages in Southeast Asia, including Chinese, Thai and Vietnamese, typically models both acoustics and language at the syllable level. This paper presents a new approach for recognizing those languages by exploiting information at the word level. The new approach, adapted from our FLaVoR architecture [1], consists of two layers. In the first layer, a purely acoustic-phonemic search generates a dense phoneme network enriched with metadata. In the second layer, word decoding is performed on the composition of a series of finite-state transducers (FSTs) that combine knowledge sources at the sub-lexical, lexical and word-based language model levels. Experimental results on the Vietnamese Broadcast News corpus showed that our approach is both effective and flexible. © 2006 Springer-Verlag Berlin/Heidelberg.
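
The second layer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tiny hand-made phoneme lattice, an invented four-word lexicon with ASCII pseudo-phonemes, and unigram word costs in place of a real language model. All names (`build_lexicon`, `compose`, `best_path`) and all costs are hypothetical. The sketch composes the lattice with a lexicon transducer (phonemes in, words out) using plain epsilon-free label matching in the tropical semiring, then picks the cheapest path with Dijkstra's algorithm.

```python
import heapq
from collections import defaultdict

EPS = "<eps>"  # empty output label

# An arc is (src, in_label, out_label, cost, dst); costs are non-negative.

def build_lexicon(lexicon, word_cost):
    """Lexicon transducer: reads phonemes, emits the word on its first phoneme.
    State 0 is both start and final, so words can follow one another."""
    arcs, next_state = [], 1
    for word, phones in lexicon.items():
        src = 0
        for i, ph in enumerate(phones):
            last = i == len(phones) - 1
            dst = 0 if last else next_state
            if not last:
                next_state += 1
            out = word if i == 0 else EPS
            cost = word_cost[word] if i == 0 else 0.0
            arcs.append((src, ph, out, cost, dst))
            src = dst
    return arcs, 0, {0}

def compose(a, b):
    """Compose A with B by matching A's output labels to B's input labels.
    Assumes no epsilons on the matched side (true for the toy machines here)."""
    (a_arcs, a_start, a_finals), (b_arcs, b_start, b_finals) = a, b
    a_out, b_out = defaultdict(list), defaultdict(list)
    for s, i, o, c, d in a_arcs:
        a_out[s].append((i, o, c, d))
    for s, i, o, c, d in b_arcs:
        b_out[s].append((i, o, c, d))
    start = (a_start, b_start)
    arcs, seen, stack = [], {start}, [start]
    while stack:
        sa, sb = stack.pop()
        for ia, oa, ca, da in a_out[sa]:
            for ib, ob, cb, db in b_out[sb]:
                if oa == ib:
                    dst = (da, db)
                    arcs.append(((sa, sb), ia, ob, ca + cb, dst))
                    if dst not in seen:
                        seen.add(dst)
                        stack.append(dst)
    finals = {s for s in seen if s[0] in a_finals and s[1] in b_finals}
    return arcs, start, finals

def best_path(fst):
    """Dijkstra over the composed machine; returns (cost, words) or None."""
    arcs, start, finals = fst
    out = defaultdict(list)
    for s, _, o, c, d in arcs:
        out[s].append((o, c, d))
    heap, done = [(0.0, start, ())], set()
    while heap:
        cost, state, words = heapq.heappop(heap)
        if state in done:
            continue
        done.add(state)
        if state in finals:
            return cost, list(words)
        for o, c, d in out[state]:
            if d not in done:
                emitted = words if o == EPS else words + (o,)
                heapq.heappush(heap, (cost + c, d, emitted))
    return None

# Toy phoneme lattice (an acceptor): one competing arc between nodes 4 and 5.
lattice_arcs = [
    (0, "v", "v", 0.1, 1), (1, "i", "i", 0.1, 2), (2, "e", "e", 0.1, 3),
    (3, "t", "t", 0.1, 4), (4, "n", "n", 0.2, 5), (4, "m", "m", 0.5, 5),
    (5, "a", "a", 0.1, 6), (6, "m", "m", 0.1, 7),
]
lattice = (lattice_arcs, 0, {7})

# Hypothetical lexicon and unigram costs, for illustration only.
lexicon = {"viet": ("v", "i", "e", "t"), "nam": ("n", "a", "m"),
           "vi": ("v", "i"), "et": ("e", "t")}
word_cost = {"viet": 1.0, "nam": 1.0, "vi": 2.0, "et": 2.0}

cost, words = best_path(compose(lattice, build_lexicon(lexicon, word_cost)))
print(words, round(cost, 3))
```

With these made-up costs the search prefers the two-word reading over the three-word one, because the word-level costs outweigh the identical acoustic scores. A real system would use a weighted-FST toolkit with proper epsilon handling and an n-gram language model rather than this hand-rolled composition.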

Cite

Vu, Q., Demuynck, K., & Van Compernolle, D. (2006). Vietnamese automatic speech recognition: The FLaVoR approach. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4274 LNAI, pp. 464–474). https://doi.org/10.1007/11939993_49
