Astra: Exploiting Predictability to Optimize Deep Learning

Abstract

We present Astra, a compilation and execution framework that optimizes the execution of a deep learning training job. Instead of treating the computation as a generic data flow graph, Astra exploits domain knowledge about deep learning to adopt a custom approach to compiler optimization. The key insight in Astra is to exploit the unique repetitiveness and predictability of a deep learning job to perform online exploration of the optimization state space in a work-conserving manner, while making progress on the training job. This dynamic state-space exploration uses lightweight profiling and indexing of profile data, coupled with several techniques to prune the exploration state space. Effectively, the execution layer custom-wires the infrastructure end-to-end for each job and hardware, while keeping the compiler simple and maintainable. We have implemented Astra in two popular deep learning frameworks, PyTorch and TensorFlow. On state-of-the-art deep learning models, we show that Astra improves the end-to-end performance of deep learning training by up to 3x, while approaching the performance of hand-optimized implementations such as cuDNN where available. Astra also significantly outperforms static compilation frameworks such as TensorFlow XLA in both performance and robustness.
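To make the core idea concrete, here is a minimal Python sketch of work-conserving online exploration. The class name, the plan interface, and the pruning heuristic are illustrative assumptions, not Astra's actual implementation: each training iteration executes one candidate execution plan for a sub-graph, profiles it as a side effect of a real training step, and abandons the remaining candidates once measurements stop improving.

    import time

    class OnlineExplorer:
        """Illustrative sketch (not Astra's real API) of work-conserving
        online exploration over semantically equivalent execution plans."""

        def __init__(self, candidate_plans, patience=3):
            self.pending = list(candidate_plans)  # plans not yet profiled
            self.timings = {}                     # profile index: plan -> latency
            self.patience = patience              # pruning window (assumed heuristic)

        def run_step(self, *inputs):
            # Work-conserving: the profiled execution IS a real training
            # step, so exploration never wastes an iteration.
            if self.pending:
                plan = self.pending.pop(0)
                start = time.perf_counter()
                out = plan(*inputs)
                self.timings[plan] = time.perf_counter() - start
                self._prune()
                return out
            # Exploration done: commit to the fastest measured plan.
            best = min(self.timings, key=self.timings.get)
            return best(*inputs)

        def _prune(self):
            # State-space pruning (an assumed heuristic, not Astra's
            # policy): drop the remaining candidates once the last
            # `patience` measurements fail to improve on the best latency.
            measured = list(self.timings.values())
            if len(measured) > self.patience:
                if min(measured[-self.patience:]) > min(measured[:-self.patience]):
                    self.pending.clear()

In use, the candidate plans would be semantically equivalent implementations of the same sub-graph (for example, different kernel fusions), and the training loop would simply call run_step on each batch; because every profiled run is also a genuine training iteration, no work is wasted on offline auto-tuning.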

Citation (APA)
Sivathanu, M., Chugh, T., Singapuram, S. S., & Zhou, L. (2019). Astra: Exploiting Predictability to Optimize Deep Learning. In International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (pp. 909–923). Association for Computing Machinery. https://doi.org/10.1145/3297858.3304072
