This paper presents our work in the participation of IWSLT 2022 simultaneous speech translation evaluation. For the track of text-to-text (T2T), we participate in three language pairs and build wait-k based simultaneous MT (SimulMT) model for the task. The model was pretrained on WMT21 news corpora, and was further improved with in-domain fine-tuning and self-training. For the speech-to-text (S2T) track, we designed both cascade and end-to-end form in three language pairs. The cascade system is composed of a chunking-based streaming ASR model and the SimulMT model used in the T2T track. The end-to-end system is a simultaneous speech translation (SimulST) model based on wait-k strategy, which is directly trained on a synthetic corpus produced by translating all texts of ASR corpora into specific target language with an offline MT model. It also contains a heuristic sentence breaking strategy, preventing it from finishing the translation before the the end of the speech. We evaluate our systems on the MUST-C tst-COMMON dataset and show that the end-to-end system is competitive to the cascade one. Meanwhile, we also demonstrate that the SimulMT model can be efficiently optimized by these approaches, resulting in the improvements of 1-2 BLEU points.
CITATION STYLE
Wang, M., Guo, J., Li, Y., Qiao, X., Wang, Y., Li, Z., … Qin, Y. (2022). The HW-TSC’s Simultaneous Speech Translation System for IWSLT 2022 Evaluation. In IWSLT 2022 - 19th International Conference on Spoken Language Translation, Proceedings of the Conference (pp. 247–254). Association for Computational Linguistics (ACL).
Mendeley helps you to discover research relevant for your work.