The continued growth in the memory, runtime and energy consumption of deployed machine learning models on one side, and the trend toward miniaturized intelligent devices and sensors on the other, imply that model compression will remain a critical need for the foreseeable future. A scalable solution to this problem must handle arbitrary choices of the reference model to be compressed (driven by the machine learning task), of the form of compression to use, and of the costs and constraints to obey (driven by the target device). We describe an open-source toolkit that is primarily designed to be flexible and extensible, but which is also efficient in compression time and achieves state-of-the-art accuracy-compression curves, as demonstrated empirically over a number of deep net architectures. Mathematically, this is achieved by formulating compression as a constrained optimization using auxiliary variables that facilitate separability, and solving it via a penalty method and alternating optimization, which results in a "learning-compression" (LC) algorithm. This alternates a "learning" step over the original model, independent of the compression, and a "compression" step over the compressed parameters, independent of the dataset and task. Each step can typically be solved by reusing well-known algorithms, such as SGD or EM in the learning step, or SVD or k-means in the compression step, which makes the algorithm flexible and extensible. The toolkit is available at https://github.com/UCMerced-ML/LC-model-compression.
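To make the alternation concrete, below is a minimal, self-contained sketch of the penalty-based scheme the abstract describes: the constrained problem min L(w) s.t. w = Delta(theta) is relaxed to min L(w) + (mu/2)||w - Delta(theta)||^2 with an increasing penalty mu, alternating a learning step over w and a compression step over theta. This is a toy NumPy illustration, not the toolkit's actual API: the function names (l_step, c_step_kmeans), the quadratic toy loss, and the penalty schedule are all assumptions chosen for brevity, with k-means quantization standing in as the compression.

```python
import numpy as np

def l_step(w, delta_theta, mu, grad_loss, steps=100):
    # "Learning" step: gradient descent on L(w) + (mu/2)*||w - Delta(theta)||^2.
    lr = 1.0 / (1.0 + mu)  # shrink the step size as mu grows, for stability
    for _ in range(steps):
        w = w - lr * (grad_loss(w) + mu * (w - delta_theta))
    return w

def c_step_kmeans(w, k=4, iters=50):
    # "Compression" step: argmin_theta ||w - Delta(theta)||^2 via k-means
    # quantization, where theta = (codebook, assignments) and
    # Delta(theta) = codebook[assignments]. No data or loss is needed here.
    codebook = np.linspace(w.min(), w.max(), k)
    for _ in range(iters):
        assign = np.abs(w[:, None] - codebook[None, :]).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                codebook[j] = w[assign == j].mean()
    return codebook[assign]

# Toy problem (illustrative): the "model" is a weight vector with quadratic
# loss L(w) = 0.5*||w - w_ref||^2, so grad L(w) = w - w_ref.
rng = np.random.default_rng(0)
w_ref = rng.normal(size=50)
grad_loss = lambda w: w - w_ref

w = w_ref.copy()
delta_theta = c_step_kmeans(w)          # initialize by direct compression
for t in range(30):                     # increasing penalty schedule mu_t
    mu = 1e-2 * (1.5 ** t)
    w = l_step(w, delta_theta, mu, grad_loss)  # L step: touches data only via grad
    delta_theta = c_step_kmeans(w)             # C step: touches weights only

print("distinct weight values after compression:", len(np.unique(delta_theta)))
```

The modularity claimed in the abstract is visible in the sketch: the L step would be ordinary SGD over the real network and dataset, and the C step sees only the current weights, so swapping k-means for SVD (low-rank) or thresholding (pruning) changes nothing else.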
Citation: Idelbayev, Y., & Carreira-Perpiñán, M. A. (2021). LC: A Flexible, Extensible Open-Source Toolkit for Model Compression. In International Conference on Information and Knowledge Management, Proceedings (pp. 4504–4514). Association for Computing Machinery. https://doi.org/10.1145/3459637.3482005