Cost-Aware Neural Network Splitting and Dynamic Rescheduling for Edge Intelligence


Abstract

With the rise of IoT devices and the need for intelligent applications, inference tasks are often offloaded to the cloud due to the computational limitations of end devices. Yet requests to the cloud are costly in terms of latency, so a shift of computation from the cloud to the network's edge is unavoidable. This shift is called edge intelligence and promises lower latency, among other advantages. However, some algorithms, such as deep neural networks (DNNs), are computationally intensive even for local edge servers (ES). To keep latency low, such DNNs can be split into two parts and distributed between the ES and the cloud. We present a dynamic scheduling algorithm that takes real-time parameters such as the clock speed of the ES, bandwidth, and latency into account and predicts the splitting point that is optimal with respect to latency. Furthermore, we estimate the overall costs for the ES and the cloud at run time and integrate them into our prediction and decision models. The result is a cost-aware prediction of the splitting point that can be tuned with a single parameter toward faster response or lower costs.
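To make the tunable latency/cost trade-off concrete, the following Python sketch scores every candidate split point of a layer-wise profiled DNN. It is a minimal illustration under assumed inputs: the names (LayerProfile, best_split), the linear alpha-weighting, and the simple per-second cost model are hypothetical and do not reproduce the authors' actual prediction and decision models, which rely on run-time measurements such as ES clock speed.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class LayerProfile:
    edge_time_s: float    # predicted execution time of this layer on the edge server
    cloud_time_s: float   # predicted execution time of this layer in the cloud
    output_bytes: int     # size of the activations this layer emits

def best_split(
    layers: List[LayerProfile],
    input_bytes: int,
    bandwidth_bps: float,
    rtt_s: float,
    edge_cost_per_s: float,
    cloud_cost_per_s: float,
    alpha: float,
) -> Tuple[int, float]:
    """Pick the split index k minimizing alpha * latency + (1 - alpha) * cost.

    Layers [0, k) run on the edge server, layers [k, n) in the cloud;
    k == 0 means cloud-only, k == n means edge-only. alpha = 1 tunes toward
    the fastest response, alpha = 0 toward the lowest cost (both terms are
    assumed to be pre-normalized to comparable scales).
    """
    n = len(layers)
    best_k, best_score = 0, float("inf")
    for k in range(n + 1):
        edge_time = sum(l.edge_time_s for l in layers[:k])
        cloud_time = sum(l.cloud_time_s for l in layers[k:])
        # The activations at the cut (or the raw input, if cloud-only)
        # must cross the network unless everything stays on the edge.
        transfer_s = 0.0
        if k < n:
            payload = input_bytes if k == 0 else layers[k - 1].output_bytes
            transfer_s = rtt_s + payload * 8 / bandwidth_bps
        latency = edge_time + transfer_s + cloud_time
        cost = edge_time * edge_cost_per_s + cloud_time * cloud_cost_per_s
        score = alpha * latency + (1 - alpha) * cost
        if score < best_score:
            best_k, best_score = k, score
    return best_k, best_score
```

In this sketch, calling best_split with alpha=1.0 recovers the latency-optimal split, while lowering alpha shifts layers toward whichever side is cheaper per second; re-invoking it as bandwidth, RTT, or cost rates change corresponds to the dynamic rescheduling the abstract describes.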

Citation (APA)
Luger, D., Aral, A., & Brandic, I. (2023). Cost-Aware Neural Network Splitting and Dynamic Rescheduling for Edge Intelligence. In EdgeSys 2023 - Proceedings of the 6th International Workshop on Edge Systems, Analytics and Networking, Part of EuroSys 2023 (pp. 42–47). Association for Computing Machinery, Inc. https://doi.org/10.1145/3578354.3592871
