This paper proposes a new architecture that strategically harvests the untapped compute capacity of SmartNICs to absorb transient microservice workload spikes, thereby reducing SLA violations while improving performance per unit of energy consumed. This is particularly important for ML workloads in edge deployments with stringent SLA requirements. Using this untapped capacity is preferable to deploying extra servers, since SmartNICs are more economical and operationally simpler. We propose SpikeOffload, a low-cost, scalable platform that leverages machine learning to predict load spikes and orchestrates seamless offloading of generic microservice workloads to SmartNICs, eliminating the need to pre-deploy expensive and under-utilized host servers. Our evaluation of SpikeOffload shows that SLA violations can be reduced by up to 20% for specific workloads. Furthermore, we demonstrate that for specific workloads our approach can potentially reduce capital expenditure (CAPEX) by more than 40%. In addition, performance per unit of energy consumption can be improved by up to 2x.
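The abstract describes the high-level control flow of SpikeOffload (predict an imminent spike, then shift excess microservice load to the SmartNIC) but does not include code. The sketch below is only an illustration of that predict-then-offload loop, not the paper's implementation; the names SpikeForecaster, get_load, host_capacity_rps, offload_to_smartnic, and recall_from_smartnic are hypothetical, and the simple moving-average-plus-trend predictor stands in for whatever ML model SpikeOffload actually uses.

# Illustrative sketch only: the paper does not publish SpikeOffload's code.
# All names and the toy predictor below are assumptions used to show the
# general predict-then-offload control loop described in the abstract.
import time
from collections import deque

class SpikeForecaster:
    """Toy short-horizon load predictor (moving average plus recent trend)."""
    def __init__(self, window: int = 12):
        self.history = deque(maxlen=window)

    def observe(self, requests_per_sec: float) -> None:
        self.history.append(requests_per_sec)

    def predict_next(self) -> float:
        if len(self.history) < 2:
            return self.history[-1] if self.history else 0.0
        trend = self.history[-1] - self.history[0]
        avg = sum(self.history) / len(self.history)
        return max(0.0, avg + trend)  # crude extrapolation of the recent trend

def control_loop(get_load, host_capacity_rps, offload_to_smartnic, recall_from_smartnic):
    """Predict imminent spikes; shift excess microservice load to the SmartNIC."""
    forecaster = SpikeForecaster()
    offloaded = False
    while True:
        load = get_load()                    # current requests/sec observed at the host
        forecaster.observe(load)
        predicted = forecaster.predict_next()
        if predicted > host_capacity_rps and not offloaded:
            offload_to_smartnic()            # e.g., start replica containers on the NIC's cores
            offloaded = True
        elif predicted < 0.8 * host_capacity_rps and offloaded:
            recall_from_smartnic()           # spike has passed; consolidate back on the host
            offloaded = False
        time.sleep(1.0)

The hysteresis threshold (recalling only once predicted load falls below 80% of host capacity) is likewise an assumption, included to show why a naive single threshold would cause the orchestrator to flap between host and SmartNIC.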
Citation: Tootaghaj, D. Z., Mercian, A., Adarsh, V., Sharifian, M., & Sharma, P. (2022). SmartNICs at edge for transient compute elasticity. In DistributedML 2022: Proceedings of the 3rd International Workshop on Distributed Machine Learning, part of CoNEXT 2022 (pp. 9–15). Association for Computing Machinery. https://doi.org/10.1145/3565010.3569065