Beyond sentence-level end-to-end speech translation: Context helps

20Citations
Citations of this article
75Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Document-level contextual information has shown benefits to text-based machine translation, but whether and how context helps end-to-end (E2E) speech translation (ST) is still under-studied. We fill this gap through extensive experiments using a simple concatenation-based context-aware ST model, paired with adaptive feature selection on speech encodings for computational efficiency. We investigate several decoding approaches, and introduce in-model ensemble decoding which jointly performs document- and sentence-level translation using the same model. Our results on the MuST-C benchmark with Transformer demonstrate the effectiveness of context to E2E ST. Compared to sentence-level ST, context-aware ST obtains better translation quality (+0.18-2.61 BLEU), improves pronoun and homophone translation, shows better robustness to (artificial) audio segmentation errors, and reduces latency and flicker to deliver higher quality for simultaneous translation.

Cite

CITATION STYLE

APA

Zhang, B., Titov, I., Haddow, B., & Sennrich, R. (2021). Beyond sentence-level end-to-end speech translation: Context helps. In ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 2566–2578). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.acl-long.200

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free