Word discovering in low-resources languages through cross-lingual phonemes

Abstract

This paper presents an approach for discovering word units in an unknown language under zero-resource conditions. The method relies only on acoustic similarity: cross-lingual phoneme recognition is followed by the identification of consistent phoneme strings. To this end, a two-phase algorithm is proposed. The first phase is an acoustic-phonetic decoding process that uses a universal phoneme set unrelated to the target language. Its goal is to reduce the search space of similar speech segments, avoiding the quadratic cost of comparing every speech file against every other. In the second phase, the segments found are further refined by means of different approaches based on Dynamic Time Warping. So that hypotheses are not limited to perfect phoneme-level matches, an edit distance is computed and hypotheses under a given threshold are also included. Three frame representations are studied: raw acoustic features, autoencoders, and phoneme posteriorgrams. The approach has been evaluated on the corpus used in the Zero Resource Speech Challenge 2017.
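The two phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds `max_edits` and `max_dtw`, the Euclidean frame distance, and the length normalisation are all assumptions made for illustration. Segments are assumed to arrive already decoded into phoneme labels and frame vectors, and the pairwise loop over phoneme strings is kept only for clarity.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two phoneme label sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,           # deletion
                            curr[j - 1] + 1,       # insertion
                            prev[j - 1] + cost))   # match or substitution
        prev = curr
    return prev[-1]


def dtw_distance(x, y):
    """DTW cost between two sequences of frame vectors.

    Frames are compared with Euclidean distance (an assumption) and the
    accumulated cost is divided by len(x) + len(y) so that segments of
    different durations are comparable.
    """
    n, m = len(x), len(y)
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(x[i - 1], y[j - 1])
            acc[i][j] = d + min(acc[i - 1][j],      # stretch y
                                acc[i][j - 1],      # stretch x
                                acc[i - 1][j - 1])  # advance both
    return acc[n][m] / (n + m)


def match_segments(segments, max_edits=1, max_dtw=0.5):
    """Return (i, j, dtw) for segment pairs that pass both phases.

    `segments` is a list of (phoneme_labels, frame_vectors) tuples.
    Phase 1 keeps pairs whose phoneme strings differ by at most
    `max_edits`; phase 2 keeps those whose DTW cost is at most `max_dtw`.
    Both thresholds are illustrative values.
    """
    matches = []
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if edit_distance(segments[i][0], segments[j][0]) > max_edits:
                continue  # phoneme strings too different, skip DTW
            cost = dtw_distance(segments[i][1], segments[j][1])
            if cost <= max_dtw:
                matches.append((i, j, cost))
    return matches


# Toy data: one-dimensional "frames" stand in for real acoustic features.
segments = [
    (["k", "a", "t"], [[0.0], [1.0], [2.0]]),
    (["k", "o", "t"], [[0.0], [1.0], [1.0], [2.0]]),
    (["m", "i", "s", "u"], [[5.0], [6.0]]),
]
matches = match_segments(segments)
```

In this sketch the cheap phoneme-string comparison acts as a filter, so the more expensive frame-level DTW runs only on pairs that already look similar. The frame vectors could be any of the three representations named in the abstract.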

Cite

García-Granada, F., Sanchis, E., Castro-Bleda, M. J., González, J. Á., & Hurtado, L. F. (2019). Word discovering in low-resources languages through cross-lingual phonemes. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11658 LNAI, pp. 133–141). Springer Verlag. https://doi.org/10.1007/978-3-030-26061-3_14
