Convolutional neural networks (CNNs) have achieved excellent results in image recognition, the task of classifying objects in images. A typical CNN uses a deep architecture with a large number of layers and weights to achieve high performance. Such networks require relatively large memory and computational resources, which not only lengthen training time but also limit real-time deployment of the trained model. For this reason, various neural network compression methods have been studied so that CNNs can be used efficiently on small embedded hardware such as mobile and edge devices. In this paper, we propose a kernel density estimation (KDE) based non-uniform quantization methodology that performs compression efficiently. The proposed method quantizes weights efficiently using a significantly smaller number of sampled weights than the number of original weights. Four-bit quantization experiments on ImageNet classification with various CNN architectures show that the proposed methodology quantizes weights efficiently in terms of computational cost without a significant reduction in model performance.
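The core idea described in the abstract, fitting a kernel density estimate to a small sample of the weights and deriving non-uniform quantization levels from the estimated density, can be sketched as follows. This is a minimal illustration rather than the authors' exact algorithm: it assumes a Gaussian KDE (scipy.stats.gaussian_kde) and places the sixteen four-bit levels at equal-probability quantiles of the estimated distribution; the paper's actual level-placement rule may differ, and the function kde_quantize and its parameters are hypothetical names introduced here.

# A minimal sketch of KDE-based non-uniform 4-bit weight quantization.
# NOTE: an illustration under the assumptions stated above, not the
# authors' exact algorithm.
import numpy as np
from scipy.stats import gaussian_kde

def kde_quantize(weights, n_bits=4, n_samples=1024, seed=0):
    """Quantize a weight tensor to 2**n_bits non-uniformly spaced levels."""
    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    # Estimate the weight distribution from a small random sample,
    # not from all of the original weights.
    sample = rng.choice(flat, size=min(n_samples, flat.size), replace=False)
    kde = gaussian_kde(sample)

    # Evaluate the KDE on a fine grid and build its empirical CDF.
    grid = np.linspace(flat.min(), flat.max(), 2048)
    pdf = kde(grid)
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]

    # Place one level at the center of each equal-probability bin, so
    # levels are dense where weights are dense (non-uniform spacing).
    n_levels = 2 ** n_bits
    probs = (np.arange(n_levels) + 0.5) / n_levels
    levels = np.interp(probs, cdf, grid)

    # Map every weight to its nearest level.
    idx = np.abs(flat[:, None] - levels[None, :]).argmin(axis=1)
    return levels[idx].reshape(weights.shape), levels

# Usage: quantize a toy convolution kernel and inspect the levels.
w = np.random.randn(64, 3, 3, 3).astype(np.float32) * 0.05
w_q, levels = kde_quantize(w, n_bits=4)
print("levels:", np.round(levels, 4))
print("mean abs error:", np.abs(w - w_q).mean())

Because the sample is far smaller than the full weight tensor, the KDE fit and level placement stay cheap, which reflects the computational-cost claim in the abstract.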
Seo, S., & Kim, J. (2019). Efficient weights quantization of convolutional neural networks using kernel density estimation based non-uniform quantizer. Applied Sciences (Switzerland), 9(12). https://doi.org/10.3390/app9122559