JHU IWSLT 2023 Dialect Speech Translation System Description

Amir Hussein; Cihan Xiao; Neha Verma; Thomas Thebaud; Matthew Wiesner; Sanjeev Khudanpur

Conference Proceedings

20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference (2023) 283-290

DOI: 10.18653/v1/2022.iwslt-1.29

Abstract

This paper presents JHU’s submissions to the IWSLT 2023 dialectal and low-resource track for Tunisian Arabic to English speech translation. The Tunisian dialect lacks a formal orthography and abundant training data, which makes it challenging to develop effective speech translation (ST) systems. To address these challenges, we explore the integration of large pretrained machine translation (MT) models, such as mBART and NLLB-200, in both end-to-end (E2E) and cascaded ST systems. We also improve automatic speech recognition (ASR) performance through pseudo-labeling data augmentation and channel matching on telephone data. Finally, we combine our E2E and cascaded ST systems with Minimum Bayes-Risk decoding. Our combined system achieves BLEU scores of 21.6 and 19.1 on test2 and test3, respectively.
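
The abstract does not say how the Minimum Bayes-Risk (MBR) combination is configured, so the following is only a minimal sketch of the general idea: given one candidate translation per system, pick the candidate that agrees most, on average, with the others. The function names `unigram_f1` and `mbr_select` are hypothetical, the candidates are weighted uniformly, and a simple token-overlap F1 stands in for a sentence-level BLEU utility; none of these choices is taken from the paper.

```python
from collections import Counter
from typing import Callable, List


def unigram_f1(hyp: str, ref: str) -> float:
    """Token-overlap F1; an assumed, cheap stand-in for sentence-level BLEU."""
    hyp_counts, ref_counts = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((hyp_counts & ref_counts).values())  # clipped token matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


def mbr_select(
    candidates: List[str],
    utility: Callable[[str, str], float] = unigram_f1,
) -> str:
    """Return the candidate with the highest average utility against the others.

    Assumes uniform weights over candidates (e.g. one output per ST system).
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    if len(candidates) == 1:
        return candidates[0]

    def expected_utility(i: int) -> float:
        # Treat every other candidate as a pseudo-reference for candidate i.
        others = [c for j, c in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], ref) for ref in others) / len(others)

    best_index = max(range(len(candidates)), key=expected_utility)
    return candidates[best_index]
```

In this sketch, ties are broken in favor of the earlier candidate, so the order in which system outputs are passed can matter when the scores are equal.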

Cite

Hussein, A., Xiao, C., Verma, N., Thebaud, T., Wiesner, M., & Khudanpur, S. (2023). JHU IWSLT 2023 Dialect Speech Translation System Description. In 20th International Conference on Spoken Language Translation, IWSLT 2023 - Proceedings of the Conference (pp. 283–290). Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.iwslt-1.29
