Benchmarking the Performance of Accelerators on National Cyberinfrastructure Resources for Artificial Intelligence / Machine Learning Workloads

Abhinand Nasari; Hieu Le; Richard Lawrence; Zhenhua He; Xin Yang; Mario Krell; Alex Tsyplikhin; Mahidhar Tatineni; Tim Cockerill; Lisa Perez; Dhruva Chakravorty; Honggao Liu

Conference Proceedings, Open Access

PEARC 2022 Conference Series - Practice and Experience in Advanced Research Computing 2022 - Revolutionary: Computing, Connections, You (2022)

DOI: 10.1145/3491418.3530772

Abstract

Upcoming regional and National Science Foundation (NSF)-funded Cyberinfrastructure (CI) resources will give researchers opportunities to run their artificial intelligence / machine learning (AI/ML) workflows on accelerators. To effectively leverage this burgeoning CI-rich landscape, researchers need extensive benchmark data to maximize performance gains and map their workflows to appropriate architectures. This data will further help CI administrators, NSF program officers, and CI allocation reviewers make informed determinations on CI-resource allocations. Here, we compare the performance of two very different architectures, the commonly used Graphics Processing Units (GPUs) and the new generation of Intelligence Processing Units (IPUs), by running training benchmarks of common AI/ML models. Leveraging the maturity of the software stacks and the ease of migration between these platforms, we find that performance and scaling are similar for both architectures. Exploring training parameters such as batch size, however, shows that, owing to their memory processing structures, IPUs run efficiently with smaller batch sizes, while GPUs benefit from large batch sizes to extract sufficient parallelism in neural network training and inference. Each choice comes with different advantages and disadvantages, which are discussed in this paper. As such, considerations of inference latency, inherent parallelism, and model accuracy will play a role in researchers' selection of these architectures. The impact of these choices on a representative image compression model system is discussed.
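The batch-size exploration described in the abstract can be pictured as a simple timing sweep. The following is a minimal sketch, not the authors' benchmark harness: `sweep_batch_sizes` and `dummy_step` are hypothetical names, and `dummy_step` is a CPU-only stand-in for one training step. On a real GPU or IPU, the step function would also need to wait for the device to finish before the timer stops.

```python
import time
from typing import Callable, Dict, Iterable


def sweep_batch_sizes(
    train_step: Callable[[int], None],
    batch_sizes: Iterable[int],
    steps: int = 20,
    warmup: int = 3,
) -> Dict[int, Dict[str, float]]:
    """Time train_step(batch_size) and report throughput and step latency."""
    results: Dict[int, Dict[str, float]] = {}
    for bs in batch_sizes:
        # Untimed warm-up steps absorb one-off costs such as compilation.
        for _ in range(warmup):
            train_step(bs)
        start = time.perf_counter()
        for _ in range(steps):
            train_step(bs)
        elapsed = time.perf_counter() - start
        results[bs] = {
            "samples_per_sec": bs * steps / elapsed,
            "sec_per_step": elapsed / steps,
        }
    return results


def dummy_step(batch_size: int) -> None:
    # Hypothetical stand-in for one training step; work grows with batch size.
    sum(i * i for i in range(batch_size * 1000))


if __name__ == "__main__":
    for bs, m in sweep_batch_sizes(dummy_step, [8, 32, 128]).items():
        print(
            f"batch={bs:4d}  {m['samples_per_sec']:10.1f} samples/s  "
            f"{m['sec_per_step'] * 1e3:8.3f} ms/step"
        )
```

Reporting both samples per second and seconds per step keeps the trade-off visible: a larger batch can raise throughput while also raising per-step latency, which is the kind of consideration the abstract points to when choosing between architectures.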

Citation (APA)

Nasari, A., Le, H., Lawrence, R., He, Z., Yang, X., Krell, M., … Liu, H. (2022). Benchmarking the Performance of Accelerators on National Cyberinfrastructure Resources for Artificial Intelligence / Machine Learning Workloads. In PEARC 2022 Conference Series - Practice and Experience in Advanced Research Computing 2022 - Revolutionary: Computing, Connections, You. Association for Computing Machinery, Inc. https://doi.org/10.1145/3491418.3530772
