Performance Optimization in the LLM World 2024


Abstract

The popularity and adoption of large language models (LLMs) such as ChatGPT have grown rapidly. LLM pre-training is expensive: ChatGPT is estimated to cost over $700,000 per day to operate, and using GPT-4 to support customer service can cost a small business over $21,000 a month. These high infrastructure and financial costs, coupled with the specialized talent required, put LLM technology out of reach for most organizations. For instance, the up-front costs include the emissions generated to manufacture the relevant hardware and the cost of running that hardware during the training procedure, both while the machines operate at full capacity and while they do not. The best estimate of the dynamic computing cost for GPT-3, the model behind the original ChatGPT, is approximately 1,287,000 kWh, or 552 tons of carbon dioxide. The goal of this workshop is to address the urgency of reducing the energy consumption of LLM applications by bringing together researchers from academia and industry to share their experience and insights in performance engineering in the LLM world.
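The abstract's GPT-3 emissions figure can be sanity-checked with a back-of-the-envelope calculation. The emission factor of roughly 0.429 kg CO2 per kWh used below is the value implied by the abstract's own numbers (552 tons from 1,287,000 kWh), not an official grid average:

```python
# Back-of-the-envelope check of the GPT-3 training emissions estimate.
TRAINING_ENERGY_KWH = 1_287_000   # estimated dynamic computing cost for GPT-3
EMISSION_FACTOR_KG_PER_KWH = 0.429  # implied by the 552-ton figure; assumption

co2_tons = TRAINING_ENERGY_KWH * EMISSION_FACTOR_KG_PER_KWH / 1000  # kg -> metric tons
print(f"Estimated training emissions: {co2_tons:.0f} tons CO2")  # → 552 tons CO2
```

Real-world estimates vary widely with the carbon intensity of the data center's power grid, so the factor above should be treated as illustrative only.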

Citation (APA)

Chow, K., Tang, Y., Lyu, Z., Rajput, A., & Ban, K. (2024). Performance Optimization in the LLM World 2024. In ICPE 2024 - Companion of the 15th ACM/SPEC International Conference on Performance Engineering (pp. 156–157). Association for Computing Machinery, Inc. https://doi.org/10.1145/3629527.3651436
