Characterizing Application Runtime Behavior from System Logs and Metrics

  • Gunasekaran R
  • Dillow D
  • Shipman G
  • et al.

Abstract

Large-scale systems are heavily shared resource environments where a mix of applications run concurrently and compete for network and storage resources. It is essential to characterize the runtime behavior of these applications to provision system resources and understand the impact of resource contention on an application’s performance. In this paper, we study the use of zero- and low-overhead system logs and other system metric data for characterizing the runtime behavior of several applications. We present our preliminary work on estimating an application’s I/O demands by observing its file system usage patterns over multiple runs, and on estimating an application’s network utilization by observing link-layer error logs. We also present preliminary findings on using such information in making context-sensitive scheduling decisions that minimize potentially negative interactions between applications competing for shared resources. Our analysis is based on four months of system log data collected on one of the world’s largest supercomputing facilities, the Jaguar XT5 petaflop system at Oak Ridge National Laboratory.

Cite

Gunasekaran, R., Dillow, D., Shipman, G., Vuduc, R., & Chow, E. (2011). Characterizing Application Runtime Behavior from System Logs and Metrics. In Proceedings of the 1st International Workshop on Characterizing Applications for Heterogeneous Exascale Systems (CACHES). Tucson, AZ, USA. Retrieved from http://www.mcs.anl.gov/events/workshops/caches/2011/program.html
