Calculon: a Methodology and Tool for High-Level Codesign of Systems and Large Language Models


Abstract

This paper presents a parameterized analytical performance model of transformer-based Large Language Models (LLMs) for guiding high-level algorithm-architecture codesign studies. The model derives from an extensive survey of performance optimizations that have been proposed for the training and inference of LLMs; its parameters capture application characteristics, the hardware system, and the space of implementation strategies. With such a model, we can systematically explore a joint space of hardware and software configurations to identify optimal system designs under given constraints, such as the total amount of system memory. We implemented this model and methodology in a Python-based open-source tool called Calculon. Using it, we identified novel system designs that look significantly different from current inference and training systems, showing quantitatively the estimated potential to achieve higher efficiency, lower cost, and better scalability.
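As a rough illustration of the kind of constrained search the abstract describes, the sketch below enumerates parallelization splits for a fixed device count, discards those that exceed a per-device memory budget, and keeps the one with the lowest estimated step time. It is not Calculon's API or its performance model: the names `Config`, `memory_per_device_gb`, `step_time_s`, and `best_config`, along with the cost formulas and penalty constants, are hypothetical stand-ins chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Tuple


@dataclass(frozen=True)
class Config:
    """Hypothetical parallelization split (not Calculon's own data structure)."""
    tensor_par: int
    pipeline_par: int
    data_par: int


def memory_per_device_gb(params_billion: float, cfg: Config,
                         bytes_per_param: float = 2.0) -> float:
    # Toy assumption: weights are sharded by tensor and pipeline parallelism,
    # while data parallelism replicates them.
    return params_billion * bytes_per_param / (cfg.tensor_par * cfg.pipeline_par)


def step_time_s(flops_per_step: float, cfg: Config, peak_flops: float,
                tensor_penalty: float = 0.05,
                pipeline_penalty: float = 0.02) -> float:
    # Toy assumption: ideal compute time scaled by made-up overhead factors
    # standing in for communication and pipeline-bubble costs.
    devices = cfg.tensor_par * cfg.pipeline_par * cfg.data_par
    ideal = flops_per_step / (devices * peak_flops)
    overhead = (1.0 + tensor_penalty * (cfg.tensor_par - 1)
                + pipeline_penalty * (cfg.pipeline_par - 1))
    return ideal * overhead


def best_config(num_devices: int, params_billion: float, flops_per_step: float,
                peak_flops: float, mem_limit_gb: float
                ) -> Optional[Tuple[float, Config]]:
    """Exhaustively search splits of num_devices under a memory constraint."""
    divisors = [d for d in range(1, num_devices + 1) if num_devices % d == 0]
    best: Optional[Tuple[float, Config]] = None
    for t, p in product(divisors, repeat=2):
        if num_devices % (t * p) != 0:
            continue  # the split must use exactly num_devices devices
        cfg = Config(t, p, num_devices // (t * p))
        if memory_per_device_gb(params_billion, cfg) > mem_limit_gb:
            continue  # violates the per-device memory budget
        time = step_time_s(flops_per_step, cfg, peak_flops)
        if best is None or time < best[0]:
            best = (time, cfg)
    return best


if __name__ == "__main__":
    result = best_config(num_devices=8, params_billion=20.0,
                         flops_per_step=1e15, peak_flops=1e14,
                         mem_limit_gb=16.0)
    print(result)
```

A real codesign study would replace the two toy cost functions with a detailed analytical model and would also sweep hardware parameters; the enumerate, filter, and rank structure is the only part this sketch is meant to convey.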

Cite

Isaev, M., McDonald, N., Dennison, L., & Vuduc, R. (2023). Calculon: a Methodology and Tool for High-Level Codesign of Systems and Large Language Models. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC. IEEE Computer Society. https://doi.org/10.1145/3581784.3607102
