State-of-the-art deep neural networks (DNNs) have greatly improved the performance of facial landmark detection. However, DNN models usually have a large number of parameters, which leads to high computational complexity and memory cost. To address this problem, we propose a method to compress large deep neural networks that consists of three steps. (1) Importance-based neuron pruning: compared with traditional connection pruning, we introduce weight correlations to prune unimportant neurons, which reduces both index storage and inference computation costs. (2) Product quantization: product quantization is further applied to enforce weight sharing, storing fewer cluster indexes and codebooks than scalar quantization. (3) Network retraining: to reduce training difficulty and performance degradation, we iteratively retrain the network, compressing one layer at a time. Experiments on compressing a VGG-like model for facial landmark detection demonstrate that the proposed method achieves 26x compression with 1.5% performance degradation.
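The first two steps above can be sketched on a single fully connected layer. The code below is a minimal illustration, not the paper's implementation: it scores neurons by the L2 norm of their incoming weights (a simplified stand-in for the paper's correlation-based importance measure), prunes the weakest half, then product-quantizes the remaining weights by running k-means on each sub-vector group. All layer sizes, the pruning ratio, and the quantization parameters are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32))  # toy weights: 64 inputs -> 32 output neurons

# (1) Importance-based neuron pruning: score each output neuron by the
# L2 norm of its incoming weights and drop the least important half.
# (The paper uses weight correlations; the plain norm is a stand-in.)
importance = np.linalg.norm(W, axis=0)
keep = np.argsort(importance)[len(importance) // 2:]  # keep the top 50%
W_pruned = W[:, keep]                                  # shape (64, 16)

# (2) Product quantization: split each row into sub-vectors and cluster
# each sub-vector group with k-means, so the layer is stored as small
# codebooks plus per-row cluster indexes instead of raw floats.
def kmeans(X, k, iters=20, seed=0):
    r = np.random.default_rng(seed)
    centers = X[r.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dists.argmin(1)
        for j in range(k):
            pts = X[assign == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers, assign

n_sub, k = 4, 8  # 4 sub-vectors per row, 8 centroids per sub-space
codebooks, codes = [], []
for sub in np.split(W_pruned, n_sub, axis=1):
    c, a = kmeans(sub, k)
    codebooks.append(c)
    codes.append(a)

# Decode the quantized layer from codebooks + indexes.
W_quant = np.concatenate([codebooks[i][codes[i]] for i in range(n_sub)], axis=1)

# Storage: float32 weights vs. codebooks + log2(k)-bit indexes.
orig_bits = W_pruned.size * 32
pq_bits = (sum(c.size for c in codebooks) * 32
           + sum(len(a) for a in codes) * int(np.log2(k)))
print(W_quant.shape, orig_bits, pq_bits)
```

Even on this toy layer, the product-quantized representation needs far fewer bits than the pruned float weights, since each row of a sub-space is stored as a 3-bit index into an 8-entry codebook. Step (3), iterative layer-wise retraining, would then fine-tune the network with these quantized weights fixed, one layer at a time.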
Zeng, D., Zhao, F., & Bao, Y. (2016). Compressing deep neural network for facial landmarks detection. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10023 LNAI, pp. 102–112). Springer Verlag. https://doi.org/10.1007/978-3-319-49685-6_10