A supervised phrase selection strategy for phonetically balanced standard Yorùbá Corpus

Abstract

This paper presents a scheme for developing a speech corpus for Standard Yorùbá (SY). The problem addressed is the lack of phonetically balanced corpora for most resource-scarce languages such as SY. The proposed solution hinges on the development and implementation of a supervised phrase selection procedure that uses a Rule-Based Corpus Optimization Model (RBCOM) to obtain a phonetically balanced SY corpus. This approach was compared with a random phrase selection procedure. Exploratory Data Analysis (EDA), which is premised on frequency distribution models, was then used to evaluate the distribution of allophones in the selected phrases. The goodness of fit of the frequency distributions was studied using the Kolmogorov-Smirnov, Anderson-Darling and Chi-Squared tests, and comparative studies were carried out against other techniques. The sample skewness was used to assess the normality of the data. The results confirmed the efficacy of supervised phrase selection over random phrase selection.
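The contrast between supervised and random phrase selection can be illustrated with a minimal sketch. This is not the paper's RBCOM, whose rules are not given in the abstract; it assumes a simple greedy heuristic that repeatedly picks the phrase contributing the most unseen phonetic units. The function names, the toy pool and its unit labels are all hypothetical, and the skewness helper is the plain moment-based estimator.

```python
import random
from collections import Counter


def unit_counts(phrases):
    """Count how often each phonetic unit occurs across the given phrases."""
    counts = Counter()
    for units in phrases:
        counts.update(units)
    return counts


def greedy_select(phrases, k):
    """Pick up to k phrase indices, each time taking the phrase that adds
    the most units not yet covered (assumed heuristic, not RBCOM)."""
    remaining = list(range(len(phrases)))
    covered = set()
    chosen = []
    while remaining and len(chosen) < k:
        best = max(remaining, key=lambda i: len(set(phrases[i]) - covered))
        chosen.append(best)
        covered |= set(phrases[best])
        remaining.remove(best)
    return chosen


def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5; 0.0 if all values are equal."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


# Toy candidate pool; each phrase is a list of made-up unit labels.
pool = [
    ["a", "b", "a"],
    ["a", "a", "c"],
    ["d", "e", "f"],
    ["b", "c", "a"],
    ["g", "h", "a"],
    ["a", "b", "b"],
]

supervised = greedy_select(pool, 3)
rng = random.Random(0)  # fixed seed so the random baseline is repeatable
baseline = rng.sample(range(len(pool)), 3)

sup_counts = unit_counts(pool[i] for i in supervised)
base_counts = unit_counts(pool[i] for i in baseline)
print("greedy picks:", supervised, "distinct units:", len(sup_counts))
print("random picks:", baseline, "distinct units:", len(base_counts))
print("skewness of greedy counts:", sample_skewness(list(sup_counts.values())))
```

On a real corpus the unit counts from each selection would be the input to the goodness-of-fit tests named in the abstract; the sketch stops at coverage and skewness to stay dependency-free.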

Cite

Sosimi, A., Adegbola, T., & Fakinlede, O. (2015). A supervised phrase selection strategy for phonetically balanced standard Yorùbá Corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9042, pp. 565–582). Springer Verlag. https://doi.org/10.1007/978-3-319-18117-2_42
