Abstract
The exceptional performance of general-purpose large models has driven many industries to develop domain-specific models. However, large models are time-consuming and labor-intensive to train, and they place very high hardware demands on inference, notably large memory and high computational power. These requirements pose considerable challenges for practical deployment, and as those challenges intensify, model compression has become a vital research focus. This paper presents a comprehensive review of model compression techniques, tracing their evolution from inception to future directions. To meet the urgent demand for efficient deployment, we examine several compression methods, including quantization, pruning, low-rank decomposition, and knowledge distillation, emphasizing their fundamental principles, recent advancements, and innovative strategies. By offering insights into the latest developments and their implications for practical applications, this review serves as a technical resource for researchers and practitioners, providing a range of strategies for model deployment and laying the groundwork for future advancements in model compression.
Cite
Liu, D., Zhu, Y., Liu, Z., Liu, Y., Han, C., Tian, J., … Yi, W. (2025). A survey of model compression techniques: past, present, and future. Frontiers in Robotics and AI. Frontiers Media SA. https://doi.org/10.3389/frobt.2025.1518965