Predicting the Performance Impact of Different Fat-Tree Configurations

Abstract

The fat-tree topology is one of the most commonly used network topologies in HPC systems. Vendors support several options that can be configured when deploying fat-tree networks on production systems, such as link bandwidth, number of rails, number of planes, and tapering. This paper showcases the use of simulations to compare the impact of these design options on representative production HPC applications, libraries, and multi-job workloads. We present advances in the TraceR-CODES simulation framework that enable this analysis and evaluate its prediction accuracy against experiments on a production fat-tree network. In order to understand the impact of different network configurations on various anticipated scenarios, we study workloads with different communication patterns, computation-to-communication ratios, and scaling characteristics. Using multi-job workloads, we also study the impact of inter-job interference on performance and compare the cost-performance tradeoffs.

CCS Concepts: Networks → Network performance modeling; Network simulations; Network performance analysis

Cite

Jain, N., Bhatele, A., Howell, L. H., Bohme, D., Karlin, I., Leon, E. A., … Leininger, M. L. (2017). Predicting the Performance Impact of Different Fat-Tree Configurations. In International Conference for High Performance Computing, Networking, Storage and Analysis, SC (Vol. 2017-November). IEEE Computer Society. https://doi.org/10.1145/3126908.3126967
