Efficient resource utilization is a pressing challenge, especially for HPC centers that run top-level supercomputing facilities with high energy consumption and a significant number of workgroups. Many efficiency studies based on system monitoring share a weakness: they are oriented toward professionals and the analysis of specific jobs, and are rarely available to regular users. The proposed all-round performance analysis approach covers single-application performance as well as project-level and overall system resource utilization. Based on system monitoring data, it promises to be an effective and low-cost technique aimed at all types of HPC center users. Every user of an HPC center can access details on any of their executed jobs to better understand application behavior and sequences of job runs, including scalability studies, which in turn helps them perform appropriate optimizations and apply co-design techniques. By taking all levels into consideration (user, project manager, administrator), the approach helps improve the output of HPC centers.
Nikitenko, D., Stefanov, K., Zhumatiy, S., Voevodin, V., Teplov, A., & Shvets, P. (2016). System monitoring-based holistic resource utilization analysis for every user of a large HPC center. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10049 LNCS, pp. 305–318). Springer Verlag. https://doi.org/10.1007/978-3-319-49956-7_24