Collecting and evaluating speech recognition corpora for nine Southern Bantu languages

Abstract

We describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively small compared to major corpora in world languages, and we report on our investigation of the stability of the ASR models derived from the corpus. We also report on phoneme distance measures across languages, and describe initial phone recognisers that were developed using this data.

Citation (APA)

Badenhorst, J., van Heerden, C., Davel, M., & Barnard, E. (2009). Collecting and evaluating speech recognition corpora for nine Southern Bantu languages. In EACL 2009 - Proceedings of the EACL 2009 Workshop on Language Technologies for African Languages, AfLaT 2009 (pp. 1–8). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1564508.1564510
