JHU IWSLT 2023 Dialect Speech Translation System Description

Abstract

This paper presents JHU’s submissions to the IWSLT 2023 dialectal and low-resource track for Tunisian Arabic to English speech translation. The Tunisian dialect lacks a formal orthography and abundant training data, making it challenging to develop effective speech translation (ST) systems. To address these challenges, we explore the integration of large pretrained machine translation (MT) models, such as mBART and NLLB-200, in both end-to-end (E2E) and cascaded ST systems. We also improve the performance of automatic speech recognition (ASR) through the use of pseudo-labeling data augmentation and channel matching on telephone data. Finally, we combine our E2E and cascaded ST systems with Minimum Bayes-Risk decoding. Our combined system achieves BLEU scores of 21.6 and 19.1 on test2 and test3, respectively.
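
As a rough illustration of how Minimum Bayes-Risk decoding can combine the outputs of several systems, the sketch below picks the candidate translation that agrees most, on average, with the other candidates. The abstract does not describe the utility function, the candidate pool, or any weighting, so everything here is an assumption: `unigram_f1` is a simple stand-in for a BLEU-style utility, `mbr_select` is a hypothetical helper name, and all candidates are weighted uniformly.

```python
from collections import Counter


def unigram_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1 between two strings (assumed stand-in for a BLEU-style utility)."""
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(hyp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates, utility=unigram_f1):
    """Return the candidate with the highest average utility against all other candidates."""
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        # Treat every other candidate as an equally likely pseudo-reference.
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


# Hypothetical outputs pooled from E2E and cascaded systems for one utterance.
pool = [
    "he went to the market",
    "he went to the market today",
    "he goes to the market",
    "she is at home",
]
print(mbr_select(pool))
```

In this toy pool the first hypothesis is selected because it overlaps most with the others, while the outlier contributes nothing to any candidate's score. A real setup would more likely use sentence-level BLEU or a learned metric as the utility and could weight candidates by model score.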

Cite

Hussein, A., Xiao, C., Verma, N., Thebaud, T., Wiesner, M., & Khudanpur, S. (2023). JHU IWSLT 2023 Dialect Speech Translation System Description. In 20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference (pp. 283–290). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.iwslt-1.29
