Abstract
This paper presents JHU’s submissions to the IWSLT 2023 dialectal and low-resource track on Tunisian Arabic to English speech translation (ST). The Tunisian dialect lacks a formal orthography and abundant training data, which makes it challenging to develop effective ST systems. To address these challenges, we explore the integration of large pretrained machine translation (MT) models, such as mBART and NLLB-200, in both end-to-end (E2E) and cascaded ST systems. We also improve automatic speech recognition (ASR) performance through pseudo-labeling data augmentation and channel matching on telephone data. Finally, we combine our E2E and cascaded ST systems with Minimum Bayes-Risk decoding. Our combined system achieves BLEU scores of 21.6 and 19.1 on test2 and test3, respectively.
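The system combination step mentioned in the abstract can be illustrated with a minimal sketch of Minimum Bayes-Risk selection over a pooled candidate list. This is not the authors' implementation: the function names `overlap_f1` and `mbr_select` are hypothetical, the utility is a simple unigram F1 standing in for a BLEU-style metric, and all candidates are assumed to be equally weighted.

```python
from collections import Counter


def overlap_f1(hyp: str, ref: str) -> float:
    """Unigram F1 between two whitespace-tokenized strings.

    Assumption: a cheap stand-in for a sentence-level BLEU-style utility.
    """
    h, r = Counter(hyp.split()), Counter(ref.split())
    common = sum((h & r).values())  # clipped count of shared tokens
    if common == 0:
        return 0.0
    precision = common / sum(h.values())
    recall = common / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(candidates, utility=overlap_f1):
    """Return the candidate with the highest mean utility against the others.

    Assumption: candidates from all systems (e.g. E2E and cascaded) are
    pooled and weighted uniformly, so the expected utility is a plain mean.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i):
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], o) for o in others) / len(others)

    best = max(range(len(candidates)), key=expected_utility)
    return candidates[best]


if __name__ == "__main__":
    pooled = [
        "i want to go home",          # hypothetical E2E output
        "i want to go to the house",  # hypothetical cascaded output
        "i want go home",
        "he is home",
    ]
    print(mbr_select(pooled))
```

The selected hypothesis is the one that agrees most, on average, with the rest of the pool, which is why MBR tends to favor consensus translations when outputs from different systems are combined.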
Cite
Hussein, A., Xiao, C., Verma, N., Thebaud, T., Wiesner, M., & Khudanpur, S. (2023). JHU IWSLT 2023 Dialect Speech Translation System Description. In 20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference (pp. 283–290). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.iwslt-1.29