We present a multivariate analysis technique called Co-plot that is especially suitable for samples with many variables and relatively few observations, as is often the case with workload data. Observations and variables are analyzed simultaneously. We find three stable clusters of highly correlated variables, whereas the workloads themselves are rather different from one another. Synthetic models for workload generation are also analyzed and found to be reasonable; however, each model usually covers one machine type well. This leads us to conclude that a parameterized model of parallel workloads should be built, and we describe guidelines for such a model. Another feature that the models lack is self-similarity: we demonstrate that production logs exhibit this phenomenon in several attributes of the workload, whereas none of the synthetic models do.
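As an illustration of what testing a workload attribute for self-similarity can involve, the following is a minimal sketch of the variance-time method for estimating the Hurst parameter H. The abstract does not state which estimator was applied, so this choice is an assumption, and the function names `aggregate` and `hurst_variance_time` are hypothetical. The idea is that for a self-similar series the variance of block means decays as m^(2H-2) with block size m, so H can be read from the slope of a log-log fit; values near 0.5 suggest no long-range dependence, while values closer to 1 suggest self-similarity.

```python
import math
import random
from statistics import fmean, pvariance


def aggregate(series, m):
    """Split the series into non-overlapping blocks of size m and return block means."""
    n_blocks = len(series) // m
    return [fmean(series[i * m:(i + 1) * m]) for i in range(n_blocks)]


def hurst_variance_time(series, block_sizes):
    """Estimate the Hurst parameter from the decay of aggregated variance.

    Fits log(variance of block means) against log(block size) by least
    squares; a slope of beta gives H = 1 + beta / 2.
    """
    xs, ys = [], []
    for m in block_sizes:
        agg = aggregate(series, m)
        if len(agg) < 2:
            continue  # too few blocks to compute a variance
        var = pvariance(agg)
        if var <= 0:
            continue  # log undefined for a constant aggregated series
        xs.append(math.log(m))
        ys.append(math.log(var))
    if len(xs) < 2:
        raise ValueError("need at least two usable block sizes")
    mean_x, mean_y = fmean(xs), fmean(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return 1.0 + slope / 2.0


if __name__ == "__main__":
    # Illustrative input only: independent noise, not a real workload log.
    random.seed(0)
    noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    print(hurst_variance_time(noise, [1, 2, 4, 8, 16, 32, 64]))
```

In practice the input series would be a per-job or per-interval attribute taken from a log (for example run times or interarrival times), and the same estimate would be computed for the output of each synthetic model for comparison.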
CITATION STYLE
Talby, D., Feitelson, D. G., & Raveh, A. (1999). Comparing logs and models of parallel workloads using the co-plot method. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1659, pp. 43–66). Springer Verlag. https://doi.org/10.1007/3-540-47954-6_3