End-to-End Simultaneous Speech Translation with Pretraining and Distillation: Huawei Noah’s System for AutoSimTranS 2022

Abstract

This paper describes the system submitted to AutoSimTrans 2022 by Huawei Noah's Ark Lab, which won first place in the audio-input track of the Chinese-English translation task. Our system is based on RealTranS, an end-to-end simultaneous speech translation model. We enhance the model with pretraining: the acoustic encoder is initialized from an ASR encoder, and the semantic encoder and decoder are initialized from the encoder and decoder of an NMT model, respectively. To alleviate data scarcity, we further construct a pseudo training corpus, a form of knowledge distillation, from ASR data and the pretrained NMT model. We also apply several techniques to improve robustness and domain generalizability, including punctuation removal, token-level knowledge distillation, and multi-domain finetuning. Experiments show that our system significantly outperforms the baselines at all latency levels and verify the effectiveness of the proposed methods.
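As a rough illustration of the pretraining and distillation ideas sketched in the abstract, the Python (PyTorch) snippet below shows (i) initializing a RealTranS-style model's acoustic encoder from an ASR checkpoint and its semantic encoder and decoder from an NMT checkpoint, and (ii) a token-level knowledge-distillation loss against an NMT teacher's per-token output distribution. All names here (SpeechTranslationModel, init_from_pretrained, the checkpoint key layout, the temperature parameter) are illustrative assumptions, not the authors' actual implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical RealTranS-style module layout: an acoustic encoder over speech
# features, a semantic encoder on top of it, and a text decoder.
class SpeechTranslationModel(nn.Module):
    def __init__(self, acoustic_encoder, semantic_encoder, decoder):
        super().__init__()
        self.acoustic_encoder = acoustic_encoder
        self.semantic_encoder = semantic_encoder
        self.decoder = decoder

def init_from_pretrained(model, asr_ckpt_path, nmt_ckpt_path):
    # Assumes each checkpoint stores sub-module state dicts under
    # "encoder" / "decoder" keys; the real training code may differ.
    asr_state = torch.load(asr_ckpt_path, map_location="cpu")
    nmt_state = torch.load(nmt_ckpt_path, map_location="cpu")
    model.acoustic_encoder.load_state_dict(asr_state["encoder"])  # ASR encoder -> acoustic encoder
    model.semantic_encoder.load_state_dict(nmt_state["encoder"])  # NMT encoder -> semantic encoder
    model.decoder.load_state_dict(nmt_state["decoder"])           # NMT decoder -> decoder
    return model

def token_level_kd_loss(student_logits, teacher_logits, temperature=1.0):
    # Token-level knowledge distillation: KL divergence between the student's
    # and the (frozen) NMT teacher's per-token output distributions.
    s_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    return F.kl_div(s_log_probs, t_probs, reduction="batchmean") * temperature ** 2

In training, token_level_kd_loss would typically be mixed with the standard cross-entropy loss on reference translations; the exact weighting and whether a temperature is used are design choices not specified in the abstract.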

Citation (APA)

Zeng, X., Li, P., Li, L., & Liu, Q. (2022). End-to-End Simultaneous Speech Translation with Pretraining and Distillation: Huawei Noah’s System for AutoSimTranS 2022. In AutoSimTrans 2022 - Automatic Simultaneous Translation Challenges, Recent Advances, and Future Directions, Proceedings of the 3rd Workshop (pp. 25–33). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.autosimtrans-1.5
