Debugging large-scale, long-running parallel programs


Abstract

Cyclic debugging refers to error detection techniques in which a program is executed repeatedly to identify the original cause of incorrect runtime behavior. This repetition is especially problematic for large-scale, long-running parallel programs because of the time and processing resources each run requires and the associated computing costs. A combination of techniques that use the event graph model as the main representation of parallel program behavior offers a solution. First, process isolation reduces the number of deployed processes by executing only a subset of the original processes during debugging. Second, an integrated checkpointing mechanism makes it possible to extract limited periods of execution time or to start subsequent program executions at intermediate points. Additionally, the event graph supports equivalent re-execution of nondeterministic programs and makes it possible to investigate the effects of the program perturbation induced by the observation functionality. © Springer-Verlag Berlin Heidelberg 2002.
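To make the event graph and process isolation ideas more concrete, the following is a minimal Python sketch under stated assumptions. It is not the authors' implementation: the `Event` record, the `build_event_graph` and `boundary_receives` helpers, and the three-process trace are all hypothetical names and data introduced for illustration. The sketch treats the event graph as a set of happened-before edges (program order plus send-to-receive pairs) and identifies which receives would have to be supplied from recorded trace data if only a subset of processes were re-executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; the fields are assumptions for this sketch."""
    process: int  # rank of the process on which the event occurred
    index: int    # position in that process's local event sequence
    kind: str     # "send", "recv", or "local"


def build_event_graph(events, messages):
    """Return happened-before edges: program order plus send -> recv pairs."""
    by_process = {}
    for ev in sorted(events, key=lambda e: (e.process, e.index)):
        by_process.setdefault(ev.process, []).append(ev)

    edges = set()
    for seq in by_process.values():
        # Consecutive events on the same process are ordered.
        edges.update(zip(seq, seq[1:]))
    # Each message orders its send before its receive.
    edges.update(messages)
    return edges


def boundary_receives(messages, isolated):
    """Receives inside the isolated set whose sender lies outside it.

    In this sketch, these are the events whose message contents would
    need to come from a recorded trace instead of a live sender.
    """
    return [recv for send, recv in messages
            if recv.process in isolated and send.process not in isolated]


# Hypothetical trace: process 0 sends to 1, then process 1 sends to 2.
s0 = Event(0, 0, "send")
r1 = Event(1, 0, "recv")
s1 = Event(1, 1, "send")
r2 = Event(2, 0, "recv")
events = [s0, r1, s1, r2]
messages = [(s0, r1), (s1, r2)]

edges = build_event_graph(events, messages)
replayed = boundary_receives(messages, isolated={1, 2})
```

Here, isolating processes 1 and 2 leaves a single boundary receive (`r1`), because its sender, process 0, would not be running during the debugging session. How the original work stores and replays such messages is not specified in the abstract.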

Citation

Kranzlmüller, D., Thoai, N., & Volkert, J. (2002). Debugging large-scale, long-running parallel programs. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2330, 913–922. https://doi.org/10.1007/3-540-46080-2_96
