Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models

Abstract

Quantization is a promising approach for reducing memory overhead and accelerating inference, especially in large pre-trained language model (PLM) scenarios. Meanwhile, the lack of access to the original training data, due to security and privacy concerns, has created demand for zero-shot quantization. Most cutting-edge zero-shot quantization methods primarily target computer vision tasks and neglect the overfitting problem in the generative adversarial learning process, leading to sub-optimal performance. Motivated by this, we propose a novel zero-shot sharpness-aware quantization (ZSAQ) framework for the zero-shot quantization of various PLMs. The key algorithm for solving ZSAQ is the SAM-SGA optimization, which aims to improve quantization accuracy and model generalization by optimizing a minimax problem. We theoretically prove the convergence rate for the minimax optimization problem, and this result can be applied to other nonconvex-PL minimax optimization frameworks. Extensive experiments on 11 tasks demonstrate that our method brings consistent and significant performance gains on both discriminative and generative PLMs, i.e., up to +6.98 average score. Furthermore, we empirically validate that our method can effectively improve model generalization.
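To make the minimax structure concrete, below is a minimal, hypothetical PyTorch-style sketch of one sharpness-aware minimax step in a zero-shot quantization setup: a generator is updated by gradient ascent to synthesize hard inputs, while the quantized student takes a SAM-style perturbed descent step toward the full-precision teacher's outputs. All names (sam_sga_step, student_q, generator, rho) and the KL-based distillation loss are assumptions for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of one SAM-SGA-style minimax step (assumed setup, not the
# authors' code): ascent on a generator, sharpness-aware descent on the student.
import torch
import torch.nn.functional as F

def distill_loss(teacher, student_q, x):
    # Mismatch between quantized student and full-precision teacher outputs.
    return F.kl_div(F.log_softmax(student_q(x), dim=-1),
                    F.softmax(teacher(x), dim=-1),
                    reduction="batchmean")

def sam_sga_step(teacher, student_q, generator, opt_q, opt_g, z, rho=0.05):
    # --- maximization: gradient ascent on the generator ---
    loss_adv = distill_loss(teacher, student_q, generator(z))
    opt_g.zero_grad()
    (-loss_adv).backward()          # descend on -loss, i.e., ascend on loss
    opt_g.step()

    # --- minimization: SAM-style perturbed step on the quantized student ---
    x = generator(z).detach()
    loss = distill_loss(teacher, student_q, x)
    opt_q.zero_grad()
    loss.backward()

    # Climb to the locally sharpest point within an L2 ball of radius rho.
    with torch.no_grad():
        params = [p for p in student_q.parameters() if p.grad is not None]
        grad_norm = torch.norm(torch.stack([p.grad.norm() for p in params])) + 1e-12
        eps = [rho * p.grad / grad_norm for p in params]
        for p, e in zip(params, eps):
            p.add_(e)

    # Recompute the loss at the perturbed weights, then take the actual step.
    loss_perturbed = distill_loss(teacher, student_q, x)
    opt_q.zero_grad()
    loss_perturbed.backward()
    with torch.no_grad():           # undo the perturbation before updating
        for p, e in zip(params, eps):
            p.sub_(e)
    opt_q.step()
    return loss_perturbed.item()
```

In this sketch the teacher is kept frozen (e.g., in eval mode with gradients detached in practice); the ascent step on the generator and the perturbed descent step on the student together form one round of the minimax optimization described in the abstract.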

Citation (APA)

Zhu, M., Zhong, Q., Shen, L., Ding, L., Liu, J., Du, B., & Tao, D. (2023). Zero-Shot Sharpness-Aware Quantization for Pre-trained Language Models. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 11305–11327). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.696
