Making large-scale systems observable - another inescapable step towards exascale

Abstract

Effectively mastering extremely parallel HPC systems is impossible without a deep, detailed understanding of the internal processes and behavior of all their diverse components: computing processors and nodes, memory usage, interconnect, storage, the whole software stack, cooling, and so forth. Numerous visualization tools provide information on individual components or on the system as a whole, but most of them have severe issues that limit their applicability in real life, making them unacceptable at future system scales. Predefined monitoring systems and data sources, a lack of dynamic on-the-fly reconfiguration, and inflexible visualization and screening options are among the most common issues. The proposed approach to processing monitoring data resolves the majority of known problems, providing a scalable and flexible solution that can build on any available monitoring systems and other data sources. An implementation of the approach is successfully used in everyday practice at the supercomputer center of Moscow State University, the largest in Russia.

Cite

Nikitenko, D. A., Zhumatiy, S. A., & Shvets, P. A. (2016). Making large-scale systems observable - another inescapable step towards exascale. Supercomputing Frontiers and Innovations, 3(2), 72–79. https://doi.org/10.14529/jsfi160205
