Automated Quantization and Retraining for Neural Network Models Without Labeled Data


Abstract

Deploying neural network models to edge devices is becoming increasingly popular because such deployment decreases the response time and ensures better data privacy of services. However, running large models on edge devices poses challenges because of limited computing resources and storage space. Researchers have therefore proposed various model compression methods to reduce the model size. To balance the trade-off between model size and accuracy, conventional model compression methods require manual effort to find the optimal configuration that reduces the model size without significant degradation of accuracy. In this article, we propose a method to automatically find the optimal configurations for quantization. The proposed method suggests multiple compression configurations that produce models with different sizes and accuracies, from which users can select the configurations that suit their use cases. Additionally, we propose a retraining method that does not require any labeled datasets for retraining. We evaluated the proposed method using various neural network models for classification, regression, and semantic similarity tasks, and demonstrated that the proposed method reduced the size of models by at least 30% while maintaining less than 1% loss of accuracy. We compared the proposed method with state-of-the-art automated compression methods, and showed that it can provide better compression configurations than existing methods.
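To make the size-versus-accuracy trade-off concrete, below is a minimal sketch of uniform affine weight quantization in NumPy. This is an illustration of the general technique only, not the article's algorithm: the function names (`quantize_weights`, `dequantize`) and the choice of per-tensor min/max scaling are assumptions for the example. Lower bit widths shrink the stored weights but increase reconstruction error, which is the trade-off the proposed method searches over automatically.

```python
import numpy as np

def quantize_weights(w, bits):
    """Uniformly quantize a float weight tensor to unsigned `bits`-bit integers.

    Returns the integer codes plus the (scale, offset) needed to dequantize.
    Per-tensor min/max scaling is assumed here for simplicity.
    """
    qmax = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.round((w - lo) / scale).astype(np.int32)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Reconstruct approximate float weights from integer codes."""
    return q.astype(np.float32) * scale + lo

# Compare reconstruction error at two bit widths on random weights.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
for bits in (8, 4):
    q, scale, lo = quantize_weights(w, bits)
    err = float(np.abs(dequantize(q, scale, lo) - w).max())
    print(f"{bits}-bit max reconstruction error: {err:.4f}")
```

An 8-bit configuration quarters the storage of 32-bit floats with small error, while 4 bits halves it again at a larger error; a configuration search such as the one proposed in the article would pick the bit width per layer to stay within a user's accuracy budget. The article's retraining step then recovers lost accuracy without labels, e.g. by fitting the quantized model to the original model's outputs.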

Citation (APA)
Thonglek, K., Takahashi, K., Ichikawa, K., Nakasan, C., Nakada, H., Takano, R., … Iida, H. (2022). Automated Quantization and Retraining for Neural Network Models Without Labeled Data. IEEE Access, 10, 73818–73834. https://doi.org/10.1109/ACCESS.2022.3190627
