An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models

Abstract

Recently, pre-trained language models such as BERT have shown promising performance on multiple natural language processing tasks. However, the application of these models has been limited by their huge size. Quantization is a popular and efficient way to reduce that size. Nevertheless, most work on BERT quantization adopts basic linear clustering as the underlying quantization scheme, and few attempts have been made to upgrade it, which significantly limits quantization performance. In this paper, we implement k-means quantization and compare it with linear quantization on fixed-precision quantization of BERT. The comparison confirms that the effect of upgrading the underlying quantization scheme is underestimated and that k-means quantization has substantial development potential. In addition, we compare the two quantization schemes on ALBERT models to explore the robustness differences between different pre-trained models.
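
The abstract contrasts two underlying schemes for fixed-precision weight quantization: uniform (linear) quantization and k-means (clustered) quantization. Below is a minimal sketch of that contrast on a single weight matrix, assuming NumPy and scikit-learn; the function names, the 4-bit setting, and the synthetic weights are illustrative assumptions, not the authors' implementation.

# Illustrative sketch (not the paper's code): fixed-precision quantization of a
# weight tensor with (a) linear/uniform quantization and (b) k-means quantization.
import numpy as np
from sklearn.cluster import KMeans

def linear_quantize(weights: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Uniform quantization: map weights onto 2**num_bits evenly spaced levels."""
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / (2 ** num_bits - 1)
    # Round each weight to the nearest level index, then map back to a float value.
    indices = np.round((weights - w_min) / scale)
    return indices * scale + w_min

def kmeans_quantize(weights: np.ndarray, num_bits: int = 4) -> np.ndarray:
    """Non-uniform quantization: cluster weights into 2**num_bits shared centroids."""
    flat = weights.reshape(-1, 1)
    km = KMeans(n_clusters=2 ** num_bits, n_init=10, random_state=0).fit(flat)
    # Replace each weight with the centroid of the cluster it belongs to.
    return km.cluster_centers_[km.labels_].reshape(weights.shape)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.05, size=(256, 256)).astype(np.float32)  # toy BERT-like layer
    for name, q in [("linear", linear_quantize(w)), ("k-means", kmeans_quantize(w))]:
        mse = float(np.mean((w - q) ** 2))
        print(f"{name:8s} 4-bit quantization, reconstruction MSE: {mse:.3e}")

Because k-means places its codebook values where the weights actually concentrate rather than on an evenly spaced grid, it typically yields lower reconstruction error at the same bit width, which is the kind of gap the paper measures on BERT and ALBERT.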

Cite

APA

Zhao, Z., Liu, Y., Chen, L., Liu, Q., Ma, R., & Yu, K. (2020). An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12430 LNAI, pp. 359–371). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-60450-9_29
