Quantization-Aware Training with Dynamic and Static Pruning

Sangho An; Jongyun Shin; Jangho Kim

Journal Article (Open Access)

IEEE Access (2025) 13 57476-57484

DOI: 10.1109/ACCESS.2025.3556629

Abstract

The evolution of deep neural networks (DNNs) naturally leads to an increase in model size. This necessitates various model compression techniques, such as pruning and quantization, to reduce memory usage and power consumption. In particular, combining these compression techniques can achieve significant cost savings. However, we found that methods using dynamic pruning and quantization suffer from instability in training and poor generalization performance due to the effects of the two Straight Through Estimators (STE). To address this problem, we propose a Quantization-aware training with Dynamic and Static pruning (QADS) method that takes advantage of both pruning and quantization by performing STE operations only during quantization from a certain point in time. In our experiments, the proposed method exhibits more stable training compared to existing techniques and achieves performance improvements on the CIFAR-10/100, ImageNet, and Google Speech Command datasets. The code is provided at https://github.com/Ahnho/Quantization-aware-training-with-Dynamic-and-Static-Pruning.
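To make the two-phase idea in the abstract concrete, the following is a minimal, illustrative sketch in plain Python. It is not the authors' implementation (see the repository linked above for that). It assumes a single linear neuron with a squared-error loss, magnitude-based pruning, and a symmetric uniform quantizer; the names `quantize`, `magnitude_mask`, `train`, and `switch_step` are hypothetical. In the dynamic phase the mask is recomputed every step and gradients pass straight through both the mask and the quantizer (two STEs). After `switch_step` the mask is frozen, so only the quantizer still relies on an STE.

```python
def quantize(w, num_bits=4, w_max=1.0):
    """Symmetric uniform quantizer: clip w to [-w_max, w_max] and snap to a grid."""
    levels = 2 ** (num_bits - 1) - 1
    clipped = max(-w_max, min(w_max, w))
    return round(clipped / w_max * levels) / levels * w_max


def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes the smallest-magnitude fraction of weights."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else 1.0 for i in range(len(weights))]


def train(weights, x, target, steps, switch_step, sparsity=0.5, lr=0.05):
    """Toy training loop (assumed setup): dynamic pruning, then static pruning."""
    weights = list(weights)
    mask = magnitude_mask(weights, sparsity)
    for step in range(steps):
        dynamic = step < switch_step
        if dynamic:
            # Dynamic phase: the mask follows the current dense weights.
            mask = magnitude_mask(weights, sparsity)
        # Forward pass uses pruned, then quantized, weights.
        w_eff = [quantize(w * m) for w, m in zip(weights, mask)]
        err = sum(w * xi for w, xi in zip(w_eff, x)) - target
        grad_eff = [2.0 * err * xi for xi in x]  # d(err^2) / d(w_eff)
        for i in range(len(weights)):
            if dynamic:
                # STE through quantizer AND mask: pruned weights still get updates.
                g = grad_eff[i]
            else:
                # STE through the quantizer only: the frozen mask blocks the gradient.
                g = grad_eff[i] * mask[i]
            weights[i] -= lr * g
    return weights, mask
```

Calling `train(w, x, t, steps=100, switch_step=60)` would run 60 dynamic steps followed by 40 static steps. In the static phase the pruned entries of `weights` no longer change, which is the behaviour the sketch is meant to illustrate; the switch point and sparsity here are arbitrary choices for illustration.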

Citation (APA)

An, S., Shin, J., & Kim, J. (2025). Quantization-Aware Training with Dynamic and Static Pruning. IEEE Access, 13, 57476–57484. https://doi.org/10.1109/ACCESS.2025.3556629
