Phonetic analysis of speech, in general, requires the alignment of audio samples to its phonetic transcription. This task could be performed manually for a couple of files, but as the corpus grows large it becomes unfeasibly time-consuming, which emphasizes the need for computational tools that perform such speech-phonemes forced alignment automatically. Therefore, due to the scarce availability of phonetic alignment tools for Brazilian Portuguese (BP), this work describes the evolution process towards creating a free phonetic alignment tool for BP using Kaldi, a toolkit that has been the state of the art for open-source speech recognition. Five acoustic models were trained with Kaldi and tested in phonetic alignment, where the evaluation took place in terms of the phone boundary metric. The results show that its performance is similar to some Kaldi-based aligners for other languages, and superior to an outdated phonetic aligner for BP based on HTK toolkit.
CITATION STYLE
Dias, A. L., Batista, C., Santana, D., & Neto, N. (2020). Towards a Free, Forced Phonetic Aligner for Brazilian Portuguese Using Kaldi Tools. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12319 LNAI, pp. 621–635). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61377-8_44
Mendeley helps you to discover research relevant for your work.