EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation

Abstract

We introduce EDGEFORMER, a parameter-efficient Transformer for on-device seq2seq generation under strict computation and memory constraints. Compared with previous parameter-efficient Transformers, EDGEFORMER applies two novel principles for cost-effective parameterization, allowing it to perform better given the same parameter budget; moreover, EDGEFORMER is further enhanced by layer adaptation, an innovation proposed to improve networks with shared layers. Extensive experiments show that EDGEFORMER can effectively outperform previous parameter-efficient Transformer baselines and achieve competitive results under both the computation and memory constraints. Given the promising results, we release EDGELM, the pretrained version of EDGEFORMER, which is the first publicly available pretrained on-device seq2seq model that can be easily fine-tuned for seq2seq tasks with strong results, facilitating on-device seq2seq generation in practice.

Cite

Ge, T., Chen, S. Q., & Wei, F. (2022). EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 10786–10798). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.emnlp-main.741
