RedApt: An Adaptor for WAV2VEC 2 Encoding Faster and Smaller Speech Translation without Quality Compromise


Abstract

Pre-trained speech Transformers in speech translation (ST) have facilitated state-of-the-art (SotA) results; yet, using such encoders is computationally expensive. To improve this, we present a novel Reducer Adaptor block, RedApt, that can be seamlessly integrated within any Transformer-based speech encoding architecture. Integrating the pretrained WAV2VEC 2 speech encoder with RedApt brings a 41% speedup and a 33% memory reduction, with 24% fewer FLOPs at inference. To our positive surprise, our ST model with RedApt outperforms the SotA architecture by an average of 0.68 BLEU score on 8 language pairs from MuST-C.
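The abstract only names the idea of a reducer adaptor; the paper's exact design is not given here. The general pattern such a block follows — down-project to a bottleneck, pool along the time axis to shorten the frame sequence, up-project back to the model dimension with a residual — can be sketched in NumPy (all names, shapes, and the mean-pooling choice are hypothetical, not the paper's implementation):

```python
import numpy as np

def reducer_adaptor(x, W_down, W_up, stride=2):
    """Hypothetical reducer-adaptor sketch: bottleneck projection plus
    temporal mean-pooling that shortens the sequence by `stride`.
    x: (T, d_model) frame sequence from a Transformer encoder layer."""
    h = np.maximum(x @ W_down, 0.0)            # down-project + ReLU
    T = (h.shape[0] // stride) * stride        # drop trailing frames
    h = h[:T].reshape(-1, stride, h.shape[1]).mean(axis=1)  # pool time axis
    out = h @ W_up                             # up-project to d_model
    # residual connection with the identically pooled input
    res = x[:T].reshape(-1, stride, x.shape[1]).mean(axis=1)
    return out + res

rng = np.random.default_rng(0)
d_model, d_bottleneck, T = 8, 4, 10
x = rng.standard_normal((T, d_model))
W_down = rng.standard_normal((d_model, d_bottleneck)) * 0.1
W_up = rng.standard_normal((d_bottleneck, d_model)) * 0.1
y = reducer_adaptor(x, W_down, W_up)
print(y.shape)  # (5, 8): sequence halved, model dimension preserved
```

Halving the frame sequence at an encoder layer is what would yield the reported savings in FLOPs and memory, since self-attention cost grows quadratically with sequence length.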

Cite


Zhao, J., Yang, H., Haffari, G., & Shareghi, E. (2022). RedApt: An Adaptor for WAV2VEC 2 Encoding Faster and Smaller Speech Translation without Quality Compromise. In Findings of the Association for Computational Linguistics: EMNLP 2022 (pp. 1960–1967). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.findings-emnlp.142
