Cyber-human and cyber-physical systems have tight end-to-end latency bounds, typically on the order of a few tens of milliseconds. In contrast, cloud-based large language models (LLMs) have end-to-end latencies that are two to three orders of magnitude larger. This paper shows how to bridge this large gap by using LLMs as offline compilers for creating task-specific code that avoids LLM accesses at runtime. We provide three case studies as proofs of concept, and discuss the challenges in generalizing this technique to broader uses.
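To make the compile-once, run-many pattern concrete, the sketch below separates the slow offline phase (a cloud LLM generates task-specific code) from the fast online phase (the edge device runs the cached code with no LLM on the critical path). Everything here is an illustrative assumption rather than the paper's actual method: query_llm() is a hypothetical stand-in for any cloud LLM API, and the gesture-classification task, prompt, and file names are invented for this example.

    import importlib

    def query_llm(prompt: str) -> str:
        # Hypothetical placeholder for a cloud LLM call (e.g., a
        # chat-completions endpoint); substitute a real provider here.
        raise NotImplementedError("wire up a cloud LLM provider")

    PROMPT = (
        "Write a self-contained Python function classify_gesture(samples) "
        "that maps a list of (x, y, z) accelerometer samples to one of "
        "the labels 'swipe', 'tap', or 'shake'. Return only the code."
    )

    def compile_task_offline(path: str = "gesture_classifier.py") -> None:
        # Offline "compile" phase: one slow cloud round trip (hundreds of
        # milliseconds to seconds), paid once and amortized over every
        # later invocation at the edge.
        code = query_llm(PROMPT)
        with open(path, "w") as f:
            f.write(code)

    def classify(samples):
        # Online phase: import and run the cached, LLM-generated code.
        # No LLM access sits on the latency-critical path, so the call
        # can fit within tens-of-milliseconds end-to-end bounds.
        return importlib.import_module("gesture_classifier").classify_gesture(samples)

In this pattern, the expensive and highly variable LLM latency is incurred only during offline compilation; what remains at runtime is conventional code execution, which also gives developers a chance to inspect and test the generated code before deployment.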
Citation: Dong, Q., Chen, X., & Satyanarayanan, M. (2024). Creating Edge AI from Cloud-based LLMs. In Proceedings of the 25th International Workshop on Mobile Computing Systems and Applications (HotMobile '24) (pp. 8–13). Association for Computing Machinery. https://doi.org/10.1145/3638550.3641126